SUGI 28 Summary

In Short

Opening Session

I can't say that the opening session was disappointing, but only because SAS Institute has managed to set expectations so low. Here are some comments from attendees:

The SAS Intelligence Value Chain was introduced. We saw that diagram a lot over the next three days. It has 5 links:

  1. "Plan is a set of proven, best-practice roadmaps that are supported by integrated industry data models, project methodologies and consulting expertise that reliably create customized solutions."

    Bad grammar and buzzwords aside, that phrase does have some meaning, but nothing that SAS Institute discussed over the next few days did a good job of showing off their proposed solutions. Perhaps I just attended the wrong sessions.
  2. "ETL(Q) raises the typical function of ETL to the power of Quality. SAS ETL is an integrated ETL platform that synthesizes corporate data from operational silos of information on any platform and in any format."

    (Unless you're using Ingres, which becomes unsupported in version 9, or unless you're on a Macintosh, of course.) There is some meat in this item, but it's a bit hard to ferret out. ETL stands for Extract, Transform, and Load, the data warehousing term for what the rest of us would call "reading data". It's supported by the new ETL Studio product, part of the Business Intelligence Suite. ETL Studio is the replacement for the Warehouse Administrator, and provides what appears to be a seamless way to read in data from various sources. It can also schedule programs to run, and it maintains metadata about what it's done - more about that under Base SAS. I don't know the pricing, but it sounded like a reasonable product.
  3. "Intelligent Storage is a dedicated platform designed from the outset to efficiently disseminate information for both business intelligence and analytic requirements."

    I don't know what this refers to. "Platform" usually means "hardware", but SAS Institute is not in the hardware business. It apparently refers to their SPDS product, or to the better indexing they've incorporated into version 9, or to the new OLAP server, or to their emphasis on scalability. Or maybe it refers to the metadata server.

    This might have been made clear in the Business Intelligence Suite brochure, but by the time I made it to that corner of the demo room (and it was literally the farthest corner from the entrance) they were out of product literature. This seems like poor planning.
  4. "Business Intelligence is a set of out-of-the-box, enterprise-wide ad-hoc query and reporting capabilities for different types of users."

    No buzzwords, and some interesting products displayed. Report Studio and Web Report Studio seem to be the relevant new applications. I didn't get a chance to experiment with them on my own, so I can't say how flexible and powerful they are, but the demo looked good. Creating reports requires some advance setup of the data (so "out-of-the-box" might not be exactly correct), which would typically be done with the ETL Studio product. These new applications are also part of the Business Intelligence Suite, and appear to be aimed at non-programmers.
  5. "Analytic Intelligence is a dedicated, integrated platform for analyzing the past, present and future business scenarios to drive sound business decisions."

    I guess having two sentences in a row with clear, meaningful content and no grammatical errors was too much for them. I think what they mean by "analytic intelligence" is "statistics", as the sessions we were directed to for more information dealt with statistics and data mining, but "statistics" is probably too scary a word for their intended audience.

    It's certainly true that many, perhaps most, businesses don't analyze their data properly. It takes too much time, requires too much expertise, and costs too much money. The Analytic Intelligence products are designed to address the first two problems.

The SASWare Ballot was mentioned, but no particular items were named as being addressed in version 9.0 or 9.1; as someone put it, "There's information in that, isn't there?"

The opening session was scheduled to last for 1 hour, and had about 1 hour's worth of information, but took 2 hours. Those in the audience who were still left at the end (and quite a few left in the middle) were happy to see it end.

SAS Institute needs to think carefully about who its target audience is for the Opening Session. Is it the 3,000 programmers and analysts sitting in the back of the room? If so, then they need to address our concerns and not ply us with buzzwords. Is it the 50 people attending the executive track sitting at the front of the room? If so, SAS Institute shouldn't invite the rest of us - make us unhappy enough times and the executives will realize that something's not right.

It's sad to see how they managed to take a good message and mess it up. The concepts and implementation of the Intelligence Value Chain are good, and the problems it claims to solve are very important ones, but they failed to convey any of that.

Oh, the entertainment was not specific to the locale, as it often is. We were treated to The Amazing Kreskin. These reviews should give you the flavor of the entertainment:

Apparently no one told The Amazing Kreskin (his legal name) what SUGI stood for, as some of his remarks were clearly designed for a US audience, not an international one. And apparently no one thought that someone who had predicted that Nevada would be flooded by UFOs last summer might not be the best representative for a software product that prides itself on its ability to "drive sound business decisions".

As usual, there was a little skit. In the story, one of the presenters (I think it was Jim Davis, chief marketing officer) had gone to an outdoors store to buy a pair of the hiking boots he had seen advertised on sale, but the store had sold out. How could such a thing happen? Well, because the store's data resided in a zillion sources that weren't tied together, and no one was able to get information fast enough to make intelligent decisions about what to buy or which sales campaigns were effective. (Many of us are familiar with variations of that scenario.)

SAS Software to the rescue, of course. The rest of the skit showed how various SAS products (ETL Studio, Web Report Studio, and so forth) could be used to solve the company's information problem and make it more profitable. At the end, in a bit of unfortunate symbolism, we saw the sun setting on the corporation's headquarters after a full line of SAS products had been installed.

Not a bad idea, but it was not always clear which products were being used, and there were many pauses when nothing happened. As a drama, it didn't do well, and as marketing, it was not appropriate for the audience.

A video of highlights of the Opening Session is available at http://support.sas.com/news/feature/03mar/sugivideo.html.

But enough of that... on to what's new and different.

Version 9.1

Version 9.1 will be available to existing beta testers sometime in August, and for general release sometime around the end of the year. Version 9.2 will be available sometime next year. I heard no mention of a version 10.

Someone at the Futures Forum asked why the release has been delayed and the answer was that SAS is comfortable with the code but not with the deployment.

What this means, apparently, is that all the pieces work individually, but not together. They'll spend the next 3-6 months getting all the parts to talk nicely to each other, and making minor changes and improvements.

Base SAS

SAS/Connect

MultiProcess Connect has a few new features and is better documented. MP Connect lets you run several parts of a program simultaneously.

The piping libname engine allows you to pass data between two steps using sockets rather than a file. This lets the program start faster and reduces I/O.

Both MP Connect and piping look complicated, because the documentation shows all the possible options you might ever use. But in practice, they're easy. The difficult part is knowing when to use them.
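
The piping idea is easier to see in miniature. The sketch below is a Python analogy, not SAS syntax: two steps run at the same time, and the second reads rows from a small in-memory buffer as the first produces them, rather than waiting for a complete file. The function names, the row layout, and the buffer size are all hypothetical, and the queue merely stands in for the socket that the piping engine is described as using.

```python
import queue
import threading

# Hypothetical end-of-data marker for this sketch.
END_OF_DATA = None


def step_one(out_q, n):
    """First step: emit rows one at a time instead of writing a file."""
    for i in range(n):
        out_q.put({"id": i, "value": i * 10})
    out_q.put(END_OF_DATA)


def step_two(in_q, results):
    """Second step: process each row as soon as it arrives."""
    while True:
        row = in_q.get()
        if row is END_OF_DATA:
            break
        results.append(row["value"] + 1)


def run_pipeline(n):
    """Run both steps concurrently, connected by a small buffer."""
    pipe = queue.Queue(maxsize=4)  # stands in for the socket
    results = []
    producer = threading.Thread(target=step_one, args=(pipe, n))
    consumer = threading.Thread(target=step_two, args=(pipe, results))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return results


if __name__ == "__main__":
    print(run_pipeline(5))
```

Because the buffer holds only a few rows, the second step starts work almost immediately and nothing is written to disk in between, which is the benefit claimed for piping. It only pays off when the second step can consume rows in the order the first one produces them.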

Data Step Functions

There were dozens of other functions added in 9.0. Many of them deal with finding strings inside other strings, such as unprintable characters within text. They're in the online version 9 documentation at <http://v9doc.sas.com/sasdoc/>.

ODS

Lots of good new things in ODS for 9.1:

If you're interested in ODS, I strongly recommend that you go to the ODS developers class before SUGI, informally known as the ODS Geekfest (no, it's not for beginners, but you don't have to be a super-wizard either). I wish SUGI had more developer-led sessions.

Reading and Writing Excel Spreadsheets

"How do I read (or write) an Excel spreadsheet from SAS?" seemed to be the most popular question at SUGI. The two sessions I attended on the topic were packed: Chevell Parker's paper was in the largest room (the one used for the closing session) and still had dozens of people sitting on the floor; a session coordinator estimated 500 attendees. David Shamlin's paper was in a smaller room, and every square foot of the room that could see the screen had someone sitting or standing in it.

Installation

The installation process will be much better. There will be three methods:

There's also a new, and hopefully less confusing, place to start from, the SAS Software Navigator. Disks will be more clearly marked.

They're working on an installation validation tool and a feature testing tool.

Migration

SAS is spending a lot of effort to make sure that the conversion to version 9 goes smoothly. There's a new PROC MIGRATE which will copy data sets of all types from one version to another. It can copy from a 32-bit library to a 64-bit library on the same operating system, and it copies indexes, integrity constraints, audit trails, compression, encryption, passwords, and generation data sets.

Incidentally, the statement "If you are using AIX, HP/UX, or Solaris platforms and have 32-bit members in your libraries, you will need to migrate your libraries forward. SAS 9 supports 64-bit access only on these platforms." appeared in various places over the course of the conference. The second sentence is not worded correctly; it should be "SAS supports only 64-bit access on these platforms." In other words, that is not an exclusive list of 64-bit platforms. Elsewhere it was said that all 32-bit Unix support is being dropped.

Servers

SAS is rearranging its line of servers to simplify maintenance. I think this is a good thing. Metadata, for example, will always be handled by the metadata server, no matter which product (OLAP server, stored process server, whatever) created it. You define it once, and the information is stored and can be used in many places. Partly as a consequence of that, the metadata server will be part of Base SAS.

Also new is a SAS Management Console, which provides administrators with a single interface to many of the SAS servers and features. Among the objects which can be managed are:

The Open Metadata Server
    A central handler for all metadata. If you've struggled through the process of creating a cube on a Unix box and then viewing it from Enterprise Guide, you know why this is a welcome change.

    Another new feature of the Open Metadata Server will be the ability to control user access down to the column level.

The Workspace Server
    This seems to be the replacement for the IT or IOM server; it provides access to SAS data over a TCP/IP connection from Enterprise Guide, AppDev Studio, and some other products which query data and submit code.

The Stored Process Server
    This is relatively new, and I don't completely understand it (not that I completely understand anything in SAS; I understand this even less). The idea is that you can define a stored process, which is a SAS program with parameters, and then call it from various places (this is similar to a stored procedure in many database programs). For example, you might set up a stored process which takes a client name as its parameter and returns all the client's addresses as its result. That stored process could be called by SAS/Intrnet, or Enterprise Guide, or Office Integration, or one of the report writers in the Business Intelligence Suite.

    I'm not sure how the stored process server is bundled. I suspect it's not part of base, but I don't know whether it's a standalone product or bundled with SAS/Intrnet or the Business Intelligence Suite.

The Data Storage Server
    I'm not sure what this is. It might be a new name for the SPDS server.

XMLMaps
Schedules
Users and Authorizations

Other services provided by the management console are management of users and groups, management of server definitions, and reporting of installed SAS products and license information.

There are still lots of servers besides the ones listed above - the SAS/Share server, the SAS/Intrnet servers, SAS/Connect servers, the new SAS Access to PC File Formats for Unix server, and probably some others I'm forgetting.

The SAS Management Console, by the way, is written in Java. The initial distribution will be under Windows only, but it does run on at least some Unix platforms, and will be distributed for those platforms after further testing.

SAS/Intrnet

There's not much new, but stored processes will be available as another way to run programs, there are various bug fixes, and load balancing will be better.

One change will make debugging easier: the variable list will be moved to the log before a broker program is executed, not after, so you will be able to see the input to programs which have gotten hung up.

Enterprise Guide

The current version (with SAS 8.2) of Enterprise Guide is 1.3. If you requested the limited availability distribution of 9.0, you would have received Enterprise Guide 2.0. And the version that will ship with SAS 9.1 will be Enterprise Guide 3.0.

Version 3 will have a different interface (I don't know how it will be different) and will be able to create (and, I assume, use) stored processes. It is being rewritten in C# - bad news for those who'd like to see it ported to a different platform.

Because it's staying on a Microsoft platform, there's no possibility of fulfilling one of my requests, which is to allow the shared repository to run on a Unix server rather than a Windows server.

My other request, the ability to use automation to set values in the Enterprise Guide Administrator (reading and writing server, binder and library information from a Visual Basic program), will be taken under consideration. If that's something you also want, please let your SAS sales rep and suggest@sas.com know.

Business Intelligence Suite

The Opening Session pushed the Business Intelligence Suite. It's a bundle consisting of ETL Studio, Web Report Studio, Report Studio, Office Integration, and several servers. I've already described ETL Studio and Office Integration.

Web Report Studio and Report Studio are interactive report designers. I didn't get a good look, so I can't say much about them, but the demo looked OK. Someone mentioned to me that Report Studio's ability to export to Excel is poor; again, they need to look at what competitive products have done and emulate them.

Before you can run a report, you have to describe all the data to the metadata server. This may be the most time-consuming part of the process for existing data (if the data came into your system through ETL Studio, they'd already be in the metadata server).

I told one of the consultants in the Demo Room that this seemed like a lot of setup for a simple report, and was told "Yes, but this product is aimed at corporations which will want ten or twenty thousand copies for analysts. It's not intended for programmers."

The ETL Studio is Java-based, so it will eventually be available on platforms other than Windows. Report Studio is Windows-only.

Pricing wasn't discussed, but the name includes three buzzwords, the bundle includes the equivalent of the Integration Technologies server, and it's aimed at large businesses, so it's not going to be cheap. I hope they will make the Report Studio and Office Integration available separately at a reasonable price.

This is probably as good a place as any to mention that the data cube builder (which looks reasonable) now understands star tables.

Futures Forum

The Futures Forum was less contentious than it often has been. In some ways it was also less informative; perhaps it takes hard questions to provoke good answers.

The answers to the questions below are sometimes combined with information I got in the Demo Room.

Why are so many products being released on Windows?
    Java-based clients are being released on Windows first, but will go on other platforms later. Java turns out to be not-so-portable after all.

What about .NET?
    Communications are platform-independent, so both can be supported. If you write a .NET app, it should be able to talk to SAS.

When will we get better encryption? It's only 31 bits now.
    There are export restrictions and various other obstacles, but they're working on it. SAS 9.1 has SSL support. Disk encryption of data sets will require changes to the I/O routines, and probably won't be in 9.1.

    The SAS/Secure product can use the RSA encryption libraries, which are considered to be fairly secure. The SAS encryption method is fairly strong, but possibly not as secure as the RSA encryption.

Is there a place for government in SAS's future? Everything seems business-oriented. There's no supply chain in government, for example.
    SAS thinks that business solutions work for government as well as the private sector.

Will the Java classes and components be public?
    Not all of them, but some.

Can the enhanced editor be made more enhanced, like the Visual Basic editor for example?
    No, they haven't pursued that.

How about an import wizard to create formats from SAS datasets?
    It will be considered.

    (What I'd like is an extension to PROC FORMAT allowing something like PROC FORMAT data=mydata START=start_var LABEL=label_var OTHER='bad value'; This could be used behind the scenes in the import wizard, and also directly in code.)

What is the future relationship between SAS and Open Code?
    It's up to SAS to provide functionality that's worth paying for. No one in the industry knows what will happen.

What about SAS in multiple virtual sessions under IBM z/OS?
    They asked for a show of hands: how many people would buy it? No one raised their hand. That's the answer - SAS won't sell it if no one is going to buy it.

    (I think I heard informally that it already runs under z/Linux. Once they've gotten it to compile on one Linux system, getting it to compile on another isn't usually a big deal. The problem is testing and maintaining and documenting it. I heard a few years ago that the QC process for just one SAS/Access product on a new platform costs SAS about $100,000. The base product costs more. Half a dozen z/Linux licenses won't give them a payback.)

Does AF have a future?
    Yes. There are still lots of SAS employees supporting it. Lots of customers are using AF and building new applications. If you want to build a cross-platform app, you're better off doing it in AF than in Java.

What about DBMS/Copy?
    (DBMS/Copy is a Windows product that can convert between SAS files and various other formats, including Excel and dBase. It was created by Conceptual Software, which was bought last year by DataFlux, a SAS subsidiary. There's also a data access engine product for use in SAS.)

    The DBMS Suite is being marketed through DataFlux. Some code from DBMS/Copy has been added to the SAS/Access to PC File Formats for Unix product.

    DBMS/Analyst, which could serve as a limited function replacement for base SAS, isn't "moving forward".

How easy will SAS be to install and maintain if you keep adding servers?
    SAS thinks that everyone will be processing such large volumes of data that servers will be needed.

    There will be a focus on making installation easy.

    (See the sections on 9.1 and Installation above.)

What are the plans for improved access to SAS from third-party query tools?
    They're working on a new server that will use industry standard interfaces, probably in SAS 9.2.

    An application that uses ODBC or JDBC has access to SAS data.

    In some cases, they're working on loading data into other systems.

    (I didn't sense complete satisfaction with this answer. All the solutions are optional-at-extra-cost.)

SAS-L BOF

The official list of SAS-L BOF winners is:

The title of Most Prolific Poster, which for several years went to William Viergever, has a new owner. The new champion is Ron Fehd, with 736 postings between March 15, 2002 and March 15, 2003. This averages out to more than three postings per work day; his high-water mark was reached one day last April, with 21 postings in a single day!
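
For anyone who wants to check the per-work-day average, here is a minimal Python sketch. The posting count and the dates come from the paragraph above; the number of days off is a hypothetical assumption, since the result depends on how many work days you count.

```python
from datetime import date, timedelta


def count_weekdays(start, end):
    """Count Monday-to-Friday dates from start to end, inclusive."""
    days = (end - start).days + 1
    return sum(
        1
        for offset in range(days)
        if (start + timedelta(days=offset)).weekday() < 5
    )


def postings_per_work_day(postings, work_days):
    """Average postings per work day."""
    return postings / work_days


weekdays = count_weekdays(date(2002, 3, 15), date(2003, 3, 15))

# Every weekday counted as a work day.
rate_all_weekdays = postings_per_work_day(736, weekdays)

# Hypothetical assumption: about 20 weekdays off for holidays and vacation.
rate_with_time_off = postings_per_work_day(736, weekdays - 20)

print(weekdays, round(rate_all_weekdays, 2), round(rate_with_time_off, 2))
```

Counting every weekday in the period gives a rate a little under three per day; the "more than three" figure holds once a typical allowance for holidays and vacation is subtracted.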

Here's the Top 20 by userid:

UserID                              Count
rjf2@CDC.GOV                          738
Cassell.David@EPAMAIL.EPA.GOV         630
peter.crawford@DB.COM                 598
J.Groeneveld@ITGROUPS.COM             542
WHITLOI1@WESTAT.COM                   493
JackHamilton@FIRSTHEALTH.COM          419
stringplayer_2@YAHOO.COM              350
Howard_Schreier@ITA.DOC.GOV           344
radevenz@IX.NETCOM.COM                335
ya.huang@PFIZER.COM                   316
ghellrieg@T-ONLINE.DE                 307
Charles_S_Patridge@PRODIGY.NET        297
Paul.Dorfman@BCBSFL.COM               252
KevinMyers@AUSTIN.RR.COM              235
wwvierg@ATTGLOBAL.NET                 226
John.W@MEDISCIENCE.CO.UK              219
diskin.dennis@KENDLE.COM              219
paul_dorfman@HOTMAIL.COM              193
HERMANS1@WESTAT.COM                   192
wielki@INED.FR                        186

If you combined Paul Dorfman's three known aliases, he would move up to 6th place with 488 messages, and combining Roland Rashleigh-Berry's two userids would give him 14th place with 282 messages.
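
The re-ranking can be reproduced from the table with a short Python sketch. The counts are the ones listed above and the combined totals (488 and 282) are the ones quoted in the previous paragraph; the function name and the "(combined)" labels are made up for illustration, and the sketch assumes that neither of Roland Rashleigh-Berry's userids appears above 14th place in the table.

```python
top20 = {
    "rjf2@CDC.GOV": 738,
    "Cassell.David@EPAMAIL.EPA.GOV": 630,
    "peter.crawford@DB.COM": 598,
    "J.Groeneveld@ITGROUPS.COM": 542,
    "WHITLOI1@WESTAT.COM": 493,
    "JackHamilton@FIRSTHEALTH.COM": 419,
    "stringplayer_2@YAHOO.COM": 350,
    "Howard_Schreier@ITA.DOC.GOV": 344,
    "radevenz@IX.NETCOM.COM": 335,
    "ya.huang@PFIZER.COM": 316,
    "ghellrieg@T-ONLINE.DE": 307,
    "Charles_S_Patridge@PRODIGY.NET": 297,
    "Paul.Dorfman@BCBSFL.COM": 252,
    "KevinMyers@AUSTIN.RR.COM": 235,
    "wwvierg@ATTGLOBAL.NET": 226,
    "John.W@MEDISCIENCE.CO.UK": 219,
    "diskin.dennis@KENDLE.COM": 219,
    "paul_dorfman@HOTMAIL.COM": 193,
    "HERMANS1@WESTAT.COM": 192,
    "wielki@INED.FR": 186,
}


def rank_after_merge(counts, merged, drop):
    """Drop the per-userid rows in `drop`, add the combined rows in
    `merged`, and return each name's position (1 = most postings)."""
    combined = {k: v for k, v in counts.items() if k not in drop}
    combined.update(merged)
    ordered = sorted(combined.items(), key=lambda kv: kv[1], reverse=True)
    return {name: pos for pos, (name, _) in enumerate(ordered, start=1)}


# The third Dorfman alias is not in the Top 20; its count is implied
# by the combined total quoted in the text.
third_alias = 488 - 252 - 193

ranks = rank_after_merge(
    top20,
    merged={
        "Paul Dorfman (combined)": 488,
        "Roland Rashleigh-Berry (combined)": 282,
    },
    drop={"Paul.Dorfman@BCBSFL.COM", "paul_dorfman@HOTMAIL.COM"},
)

print(third_alias)
print(ranks["Paul Dorfman (combined)"])
print(ranks["Roland Rashleigh-Berry (combined)"])
```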

I obtained these numbers from the list server at Marist using the SAS URL filename engine. The only number I've checked against another source is Ron's: Google Groups credited him with 744 messages, which is close. I will leave it to the statisticians to decide whether the difference is statistically significant.

Closing Session

The next few SUGIs:

Two new sections will be added next year, Analytics and Solutions.

Off Topic

Like at least one other attendee, I took the train to Seattle, and like at least three other attendees, I took the train back.

The train in question is Amtrak's Coast Starlight, which runs, or maybe crawls is a better word, from Los Angeles to Seattle. I was scheduled to get on the northbound train in Sacramento at 11:59pm and to arrive in Seattle the following day at 8:30pm. It arrived in Sacramento about 12:25am and left around 12:50am, almost an hour late. We didn't lose much time, but we never made it up, either, and arrived in Seattle about an hour late.

The scenery was very nice. Oregon, in particular, is very pretty, especially in the snow (and we went through a blizzard for a while). I had an economy bedroom, which is small but adequate. There was a first class parlour car where we had a tour guide explaining the sights, and a wine tasting in the afternoon.

I learned a new railroad term. Shortly after we left a small town in Oregon, the conductor announced over the intercom that we had a carry-by. About 15 minutes later, we stopped out in the middle of nowhere, and the southbound Coast Starlight stopped on the track next to us. After a few minutes, both trains started moving again. It turns out that a passenger had gotten on the wrong train, and they stopped to let him switch.

I stayed at the Hyatt, which was the best conference hotel I've been to. It had a huge bathroom, which was brightly lit, very unusual for a hotel (with a dimmer, so it didn't have to be bright). The bed was comfortable, and the room was dark and quiet at night.

Also staying at the hotel, it turns out, were the stars of World Wrestling Entertainment, who were performing that weekend. They have lots of devoted fans. It was an odd experience walking out the hotel door to see crowds of people gazing at me. They didn't gaze for long, though - I guess I just don't look like a professional wrestler.

I ate one evening at Cutters Bayhouse, near the market. The food was good, and the waitress mentioned that her boyfriend is a railroad photographer. I didn't take any pictures on the trip, but you can see his pictures at:

The trip back was not as pleasing as the trip up. We left about half an hour late from Seattle, and as a result kept getting further and further behind schedule (the host railroads are supposed to give Amtrak trains the right of way if they are no more than 20 minutes late; they don't have to if Amtrak is later than that, and Union Pacific, in particular, doesn't). By the time we got to central Oregon, we were four hours late, and we arrived in Sacramento six hours late. I wanted to be about three hours late, because we were scheduled to get in at 6am, way too early, but six hours was a bit much. There was no parlour car (it was broken or something) so we got a lounge car instead, and the attendant chose to play loud music all day and loud movies at night, making the lounge car uninhabitable for most of my waking hours.