Saturday, February 06, 2010

SaviDataSet Alpha 1.0 On the Web

I am finally ready with my SAS dataset reader/writer for .NET. It is written in 100% managed code using .NET 3.5. The DLLs can be found here.

A sample .NET console application can be found in the program file entries after the installation.


Update 2/13/2010:


I am now testing this against a real-world project so I am finding little bugs here and there. These are being addressed for version 1.0.0.1. If you need a build sooner than it is released, let me know.

I have also found that you need around 100 or more observations for this to work correctly. I am investigating, but keep that in mind while testing.

[LATE BREAKING] I fixed the obs issue so I went ahead and uploaded the fixed version. Watch for breaking changes which are in the Readme.txt file.

Update 2/29/2010

Based upon Chris' work, I added in a console application that exposed my existing work. It allows for export of the sas7bdat to Excel, delimited, and XML. I will add in import at some point and have designed the interface as such. For example, I put the parms in an XML file to provide enough flexibility to accomplish everything. That has pros and cons but I figured the pros outweighed the cons.

I am also doing something I have meant to do for a long time which is to build a data viewer. Hence, I don't want an interim release but will bundle it all together. I have spent some time this weekend starting that process. All of the components are here but I have to pull them together which I am hoping to do this week.

28 comments:

Toni said...

Hi Alan
This weekend I tried to use your SaviDataSet to read in a SAS dataset with the ReadSasDataSet(string filename) function. It was from an ASP.NET MVC project, and because of potential size limitations on what can be read from a web project, I made a 5 KB SAS dataset with only two variables. When I ran the project I did not receive any errors, but nothing happened, and when I looked at the SAS dataset via the debugger I saw a "Cannot access a closed file" exception. Have you got any ideas about what could be causing this?
KR Toni

Toni said...

Just read your comment and wanted to add that I also tried with a 1 MB SAS dataset with approx. 1100 records, but I got the same exception.

Alan Churchill said...

Toni,

Send me the dataset and I will see what is happening. This is a very complex area and I may have a field off. Best I can come up with now but it is the truth.

My main focus over the last 6-9 months has been on the write so the read may have missed a step on the code changes.

Alan

Toni said...

Okay, I will send you both SAS datasets that I tried reading and maybe include a more detailed description of the exception I got when running ReadSasDataSet. I just wanted to add that I really like programming in .NET (C#), and in my job I work primarily with SAS, so I find this really cool. I really see potential in combining the two technologies, and a SAS dataset reader/writer is essential. So keep up the good work.

Alan Churchill said...

Toni,

Thanks for sending me the datasets. End users that tell me where issues lie and help id them are a gift.

Doing this work was a labor of love. I started with 2 extras who dropped out soon after it began and then wasted nights and weekends on it with no hope of success. Then a glimmer would happen and back to it.

I am a C# freak. I have even started coding in 4.0. Coding in Base SAS is not my preference after 20+ years of doing it.

The version you have will terminate on June 30th, FYI. I am working on getting a better driver out there.

A new enhancement coming will be to do the conversions to-from Excel, Access, delimited, XML, etc.

Alan

Chris Long said...

Hi Alan,

I'm still working on a reader myself, making good progress over the last few weeks. I'll be releasing a command-line reader utility in the next week or two.

Excellent work on the writer, I'll try it out when I figure out how to work with it in my ancient IDE :-)

Chris.

Alan Churchill said...

Toni,

I just tested the read again and did not see any issues. Can you send me the dataset and the code you used so I can validate the error?

Alan Churchill said...

Chris,

Congratulations on getting close on a reader. The writer is far, far more work and it stunned me to know how much extra effort it took (double, perhaps, maybe more). I'll take a look at it once you are ready.

Look at my object model and see if that helps you with yours.

Toni said...

Had some time to test it again today, both in ASP.NET MVC and in a console application. The dataset object has a FileStream which throws a System.ObjectDisposedException. Can't figure out if I am doing something wrong.

What email address can I send the SAS datasets to? The C# code is as in your example: first create a new SaviDataSet object and then use the ReadSasDataSet function with an appropriate path.

KR Toni
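For readers who hit the same message: the sketch below only shows how .NET itself behaves when a FileStream is touched after it has been disposed. It makes no claim about what SaviDataSet does internally. The class and method names are hypothetical, and the file is a throwaway temp file. A debugger that evaluates the properties of an already closed stream will show the same exception, so seeing it in a watch window does not necessarily mean the read failed.

```csharp
using System;
using System.IO;

// Hypothetical demo class; not part of SaviDataSet.
public static class ClosedStreamDemo
{
    // Opens a file, disposes the stream, then touches the stream again.
    // Returns the name of the exception type that was thrown, or null if none.
    public static string TouchAfterDispose(string path)
    {
        var stream = new FileStream(path, FileMode.Open, FileAccess.Read);
        stream.ReadByte();   // fine: the stream is still open here
        stream.Dispose();    // from now on the stream is closed

        try
        {
            // Reading a property of a disposed FileStream throws.
            long length = stream.Length;
            Console.WriteLine(length);
            return null;
        }
        catch (ObjectDisposedException ex)
        {
            Console.WriteLine(ex.Message);
            return ex.GetType().Name;
        }
    }

    public static void Main()
    {
        // Create a small temp file so the sketch is self-contained.
        string path = Path.GetTempFileName();
        File.WriteAllBytes(path, new byte[] { 1, 2, 3 });
        Console.WriteLine(TouchAfterDispose(path));
        File.Delete(path);
    }
}
```

If code needs data from a stream after a read method returns, one common approach is to copy what is needed into memory before the stream is disposed.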

Alan Churchill said...

Toni,

You can reach me at alan.churchill@savian.net

Alan Churchill said...

Toni,

I did find an issue with the read but not the one you saw. I would still like to see your code since I think it is code related.

I am working on a new build and should have something up in the next day or so. I am also adding in some features such as SaveToDelimited, SaveToExcel, SaveToXML, etc.

Kenneth Yan said...

I will test the dll tonight. Thank you for your excellent work, Alan.

Alan Churchill said...

Kenneth,

It is having issues on some files. Toni has provided me those 2 files and I have 1 working but the other one is still causing a problem. Fortunately, I am pretty sure I know where the problem lies. I will be working on it tomorrow to try and get a new build out. I had to divert to some customer work so it was delayed a few days.

Chris Long said...

My reader utility, dsread, is now available at http://www.oview.co.uk/dsread.

Thanks, Alan, for your support while we were first working out the details of this format.

Alan Churchill said...

Congrats Chris. I will check it out a bit later after I fix the bug in my own reader ;-]

Kenneth Yan said...

Alan,
The writer works well. I haven't tested the reader much yet.
There is one issue with the sample code:
//Export functions
ds.SaveAsDelimited(@"c:\temp\temp.txt","\t", true);
Since the static class Delimited was added, this line should be changed to "Delimited.SaveAsDelimited(ds, @"c:\temp\temp.txt","\t", true);".

Alan Churchill said...

Actually, Kenneth, the sample is correct. It is an extension method.
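To illustrate the point about extension methods, here is a minimal sketch. DemoDataSet and ToDelimited are hypothetical stand-ins invented for this example, not the actual SaviDataSet types or signatures; only the class name Delimited comes from the comment above. A static method whose first parameter is marked with "this" can be called either with instance syntax or with the static class name, and both calls compile to the same thing.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the library's dataset type.
public class DemoDataSet
{
    public List<string[]> Rows = new List<string[]>();
}

// Static class holding the export helper (the method itself is hypothetical).
public static class Delimited
{
    // The "this" modifier on the first parameter makes this an extension method.
    public static string ToDelimited(this DemoDataSet ds, string delimiter)
    {
        var lines = new List<string>();
        foreach (string[] row in ds.Rows)
        {
            lines.Add(string.Join(delimiter, row));
        }
        return string.Join("\n", lines.ToArray());
    }
}

public static class ExtensionDemo
{
    public static void Main()
    {
        var ds = new DemoDataSet();
        ds.Rows.Add(new[] { "1", "Vlad" });
        ds.Rows.Add(new[] { "2", "Toni" });

        string viaInstance = ds.ToDelimited("\t");           // extension-style call
        string viaStatic = Delimited.ToDelimited(ds, "\t");  // plain static call

        Console.WriteLine(viaInstance == viaStatic);  // prints True
    }
}
```

The instance-style call only compiles when the namespace containing the static class is in scope through a using directive.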

Alan Churchill said...

I just uploaded a new version that should fix the issues that Toni saw on the read and also corrects some issues with the export support. I expect another version soon that will add support for OleDb and Excel. I will also be working on importing from the formats rather than just saving to them. Thanks for the help and suggestions guys.

Vlad said...

Hello Alan

Great job!

I've just started studying SAS itself and am trying to make it work with my .NET program. One question about your SavianDataset: do you plan to make the writer work with streaming data, that is, data where you don't know the dataset size ahead of time? I've looked at the interface and found nothing about that.

Thanks,
Vlad

Alan Churchill said...

Vlad,

You can simply add observations to it as they are encountered. The dataset size does not need to be specified ahead of time.

I may have missed something though. Please clarify.

Vlad said...

Yes, with your interface I can simply add observations, but in my case there might be a huge number of records. As I understand it, SasDataset collects all the observations and then writes them at once. That could eat too much memory, couldn't it?

Vlad

Alan Churchill said...

Vlad,

You are correct. There needs to be a buffering mechanism so that memory is not the only store. For now, it is designed for testing and to find issues. In the real world, it needs to handle millions/billions of possible records.

Alan Churchill said...

I am also open to suggestions on the Add method. If someone wants a better way to handle the Adds, let me know.

Vlad said...

As I understand it, having exact information about the number of records makes generating the resulting SAS dataset easier, because the first data block contains all that information.

With the streaming data I see two ways:
1. Buffer it somewhere: memory or disk
2. Write records as they come and update the first header after the stream has finished.

In my case I do the second, which has some drawbacks and coding difficulties, of course, but allows me to save memory and disk space.

The first way is to buffer the whole stream into a file and then read it back out. Maybe this would be the easiest way to implement it in your SasDataset.
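The second approach can be sketched on a toy format. Everything below is an assumption made for illustration: the layout (a 4-byte record count followed by fixed-size records) is invented and is not the sas7bdat format, and CountedRecordWriter is a hypothetical name. The idea is to write a placeholder count first, append records as they arrive, and seek back to patch the count once the stream is finished, which requires a seekable output.

```csharp
using System;
using System.IO;

// Hypothetical toy format: [int32 record count][records...].
// This is NOT the sas7bdat layout; it only shows the header-patching idea.
public sealed class CountedRecordWriter : IDisposable
{
    private readonly BinaryWriter writer;
    private readonly long headerPosition;
    private int count;

    public CountedRecordWriter(Stream output)
    {
        if (!output.CanSeek)
            throw new ArgumentException("Stream must be seekable.", "output");
        headerPosition = output.Position;
        writer = new BinaryWriter(output);
        writer.Write(0);  // placeholder for the record count
    }

    // Writes one record immediately, so nothing accumulates in memory.
    public void Add(int id, double value)
    {
        writer.Write(id);
        writer.Write(value);
        count++;
    }

    // Patches the real count into the header, then closes the stream.
    public void Dispose()
    {
        writer.Flush();
        writer.BaseStream.Seek(headerPosition, SeekOrigin.Begin);
        writer.Write(count);
        writer.Close();
    }
}

public static class StreamingDemo
{
    // Writes n records, then reads the patched count back from the header.
    public static int RoundTrip(int n)
    {
        var ms = new MemoryStream();
        using (var w = new CountedRecordWriter(ms))
        {
            for (int i = 0; i < n; i++) w.Add(i, i * 1.5);
        }
        using (var reader = new BinaryReader(new MemoryStream(ms.ToArray())))
        {
            return reader.ReadInt32();
        }
    }

    public static void Main()
    {
        Console.WriteLine(RoundTrip(3));  // prints 3
    }
}
```

The trade-off matches the one described above: memory use stays flat, but the output must support seeking, and a crash before the final patch leaves a header that does not match the data.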

Alan Churchill said...

Vlad,

I think the second way is the better option. I don't want to use an Observation array and force its size to be defined up front. A generic list is better: it lets the dataset grow dynamically, and the metadata can be updated once the dataset is finalized.

Let me look at the code after I get my console interface done as well as make some progress on my dataset viewer.

Vlad Ogay said...

Hi Alan,

It's me again :)

Just grabbed your binaries and tried a simple test:

using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;
using System.Text;
using Savian.SaviDataSet;

namespace Test
{
    class Program
    {
        // Number of test rows to generate.
        private const int Max = 500;

        static void Main(string[] args)
        {
            //var m = new Savian.SaviDataSet.Main();

            // Build a small in-memory table with three labeled columns.
            var dt = new DataTable("Phonebook");
            dt.Columns.Add("ID").ExtendedProperties["Label"] = "ID";
            dt.Columns.Add("Name").ExtendedProperties["Label"] = "Name";
            dt.Columns.Add("Phone").ExtendedProperties["Label"] = "Phone";
            for (int i = 0; i < Max; i++)
            {
                dt.Rows.Add(new object[] { i, "Vlad" + i, "+74433343" });
            }

            // Convert the table to a SAS dataset and write it to disk.
            var sasDs = Utilities.ConvertNetDataTableToSas(dt, "SasPhonebook", "");
            sasDs.Save("c:\\temp\\phbook.sas7bdat");
            //Savian.SaviDataSet.Main.Test1();
        }
    }
}


The output file shows an error when I try to open it in SasViewer. Any idea why that could be?

Noel said...

The download link does not work. It takes me to a Skydrive account and the file is not found.

Alan Churchill said...

It is no longer applicable. The code is way out of date for the Alpha version.

- Alan
