| |
|
|
|
|
Read and write .csv files in RealBasicThe .csv file format is a universally understood one that has been in use for so long that it is hard to remember who actually created it and when. When data is saved in .csv format it is saved as plain text using commas and carriage returns to separate each item. It is so simple that almost all programs that are able to export and import information can do so in .csv format. This makes it perfect for exchanging information between different programs. Even if you are not trying to exchange data and just want to store information on the hard disk drive, it is often useful to save it in .csv format. Small and simple databases that store everything in memory can store the database on disk as a .csv file and there is no need for complex formats. The data is stored in a very simple and logical way, and it is quite compact too, with little wasted space. If you lose the original program you will always be able to access the original data because it can be loaded into a spreadsheet or a text editor and viewed or edited. Let's take a look at a simple example of a .csv file: Fruit,People,Places,Planets,Numbers Apple,John Smith,London,Mercury,12345 Banana,Bob Jones,New York,Venus,987654 Orange,Susan Brown,Paris,Saturn,-1 Lemon,Mary Moore,Moscow,Jupiter,3.142 There are five lines of data and five items on each line. Of course, it doesn't have to be 5 x 5 and I could have typed in more - or less. You can store as many items on each line as you want and the number of lines is equal to the amount of data you want to store. You could have 1,000 or even 10,000 lines if you wanted to and there aren't any limits. The example .csv file above is a simple database and there are four records. Each record has five fields and the name of each field is stored in the first line. This is optional and some .csv files have the field names in the first line and some don't. It's up to you whether you want to store the field names or not. Writing .csv files You will often want to save the information stored in a ListBox to disk and then load it again the next time the program is run, so this is what is used in the example program. If you want to save the contents of an array and load it back again, it is quite easy to modify the program code The program can be downloaded from here and loaded into Realbasic so you can compile it and run it yourself. The only code that is important is the code on the Load and Save buttons. The rest is trivial and you can examine the program yourself. So let's look at the code on the Save button first. This writes the data in the ListBox grid to the hard disk drive.
First we show the standard file selector and get the name of the file to save to. Of course, you could set the filename in code, but in this example the user is asked where to save the file and what to call it: f = GetSaveFolderItem("text/plain","Test.csv")
Now the file is created and it's a plain text file and a text output stream is used: tos = f.CreateTextFile You'll see some error checking in the code, for example, you need to test whether the user clicked the Cancel button when saving the file or whether the file could not be created for some reason. The next bit of code, the bit that writes the .csv data, is wrapped in an if...then in order to check that there is actually some data in the grid to save! if ListBox1.ListCount>0 then 'if grid not empty
for i=0 to ListBox1.ListCount-1 'for each row
s="" 'build line to save
for j=0 to ListBox1.ColumnCount-1 'for each column
s=s+ListBox1.Cell(i,j)+ "," 'csv
next
tos.WriteLine s.left(s.len-1) 'save line
next
end if
It's really simple and all that you need to do is to build the line that is to be saved in a string. The string variable s is set to "" (nothing) and then each item of data in the current tow is added - with a comma in between each one. The string is written to disk when all the data for the row has been added. That's it. The data is stored in a .csv file. Reading .csv files
The first task is to show a standard file selector and ask the user to select the file that the data was saved to. Of course, this step is optional and you could write the file name and path in the code to allow the program to load and save data without the user being asked: f = GetOpenFolderItem("any")
Eagle eyed readers will notice that the file was saved as a 'text/plain' file type, but now the 'any' type is used. This is because other applications can take over control of a file and change it's type. When this happens you may find it cannot be opened as a text/plain file. For exmple, opening the .csv file in TextEdit on my Apple Mac, adding or editing data and then saving it again turns it into a NeoOffice document! Double click the .csv file and the spreadsheet imports it! Weird! The file is opened as a text file: tis = f.OpenAsTextFile Then we can read the contents of the file. One line is read at a time and the RealBasic Split(s,",") function is used to place each item or field into the fields() array. This chops the string at each comma and places each chunk into a separate element of the fieds() array. It's really useful.
while not tis.EOF 'while not end-of-file
ListBox1.AddRow "" 'add row to grid
s=tis.ReadLine 'read line from file
fields=Split(s,",") 'put items in fileds() array
for i=0 to ListBox1.ColumnCount-1 'copy to grid
ListBox1.Cell(ListBox1.ListCount-1,i)=Trim(fields(i))
next
wend
Quoted strings The way that .csv files overcome this problem is by enclosing strings like this in quotes, so the address would be stored as "123, Fleet Street, London". When it is read back in from disk, the quotes should be removed and the whole string treated as one field. There are two ways of overcoming the comma-in-fields problem and you could write your own function to replace RealBasic's Split(). It would need to chop the line read from disk into chunks and store them in the fields() array. I'm not going to do that though, I'm going to cheat! Originally in the save code we built up the string to save like this: s=s+ListBox1.Cell(i,j)+"," In order to cope with commas in the data, we can simply replace them with some other character - something that isn't normally used, like this: s=s+ReplaceAll(ListBox1.Cell(i,j),",","|")+"," In this case we replace every comma with a vertical bar character - |. You could use any character of course, and I use | because it's not part of the alphabet or punctuation, so people don't normally type it in. When the file is opened for reading all we have to do is to replace the | with commas again. So in the code the read the data back in we replace this line: ListBox1.Cell(ListBox1.ListCount-1,i)=Trim(fields(i)) with this one: ListBox1.Cell(ListBox1.ListCount-1,i)=Trim(ReplaceAll(fields(i),"|",",")) This would make our .csv file non-standard, but if you are only reading and writing your own data files, this doesn't matter. Whether you cheat and replace commas on writing and reading or you create your own Split() function is up to you.
|