Friday, March 11, 2011

meandering

I complained before that it's hard to write a program without a specific purpose in mind. I've been writing a file reader without knowing what I want to read, or what to do with the data once I've read it.

Fortunately (unfortunately?) the file format I'm reading is efficient to write but awkward to read. Regardless of what I do with the data, I still have to write a program that knows how to read any part of it (or skip over it).

Assert assumptions:

1) The file header comes first and tells us how many channels (of various types) there are.
2) Next are all the channel headers for one data type (followed by the other two types). We read in and store them, or skip over them.
3) Here is where we encounter actual data block headers. We read in the header, and in this header we see what kind of data follows it, and how much of that data exists.
3a) If we care about the data, we read in (number of data * size of one unit of that data) bytes.
3b) If we don't care about the data, we skip past (number of data * size of one unit of that data) bytes.
4) EOF is reached and program ends the reading section.
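
The steps above can be sketched in C roughly as follows. This is only a sketch under assumptions: the real file format isn't described here, so the FileHeader, ChannelHeader and BlockHeader layouts, the three item sizes, and the function name read_wanted_blocks are all made up for illustration. It also reads structs straight off the disk, which only works if the file was written with the same padding and byte order.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define NUM_TYPES 3

/* Hypothetical layouts: the real format is not known here. */
typedef struct {
    uint32_t num_channels[NUM_TYPES]; /* channel count for each data type */
} FileHeader;

typedef struct {
    char name[16];                    /* placeholder channel description */
} ChannelHeader;

typedef struct {
    uint32_t type;                    /* which kind of data follows (0..2) */
    uint32_t count;                   /* how many items of that data follow */
} BlockHeader;

/* Assumed size in bytes of one item of each data type. */
static const size_t ITEM_SIZE[NUM_TYPES] = {
    sizeof(int16_t), sizeof(float), sizeof(double)
};

/* Walks the whole file, reading blocks of wanted_type and skipping the rest.
 * Returns the number of items read, or -1 on a malformed file or I/O error. */
long read_wanted_blocks(FILE *fp, uint32_t wanted_type)
{
    FileHeader fh;
    BlockHeader bh;
    long items_read = 0;

    /* 1) The file header comes first. */
    if (fread(&fh, sizeof fh, 1, fp) != 1)
        return -1;

    /* 2) Channel headers, one group per data type. Skipped in this sketch. */
    for (int t = 0; t < NUM_TYPES; t++) {
        long bytes = (long)fh.num_channels[t] * (long)sizeof(ChannelHeader);
        if (fseek(fp, bytes, SEEK_CUR) != 0)
            return -1;
    }

    /* 3) Data block headers, each followed by its data, until fread fails. */
    while (fread(&bh, sizeof bh, 1, fp) == 1) {
        if (bh.type >= NUM_TYPES)
            return -1;
        size_t bytes = (size_t)bh.count * ITEM_SIZE[bh.type];

        if (bh.type == wanted_type) {
            /* 3a) We care: read (number of data * size of one unit). */
            void *buf = malloc(bytes ? bytes : 1);
            if (buf == NULL)
                return -1;
            if (bytes != 0 && fread(buf, 1, bytes, fp) != bytes) {
                free(buf);
                return -1;
            }
            /* ... a real program would do something with buf here ... */
            free(buf);
            items_read += (long)bh.count;
        } else {
            /* 3b) We don't care: skip the same number of bytes. */
            if (fseek(fp, (long)bytes, SEEK_CUR) != 0)
                return -1;
        }
    }

    /* 4) No more block headers: end of file (or a read error). */
    return ferror(fp) ? -1 : items_read;
}
```

Calling read_wanted_blocks(fp, 2) on a file opened with fopen(path, "rb") would read only the blocks tagged as type 2 and step over everything else. Note that the skip never touches the skipped bytes; only the headers are actually read.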

This is somewhat irritating because you MUST read the whole file (storing or skipping the data as needed), and the cost of doing that grows as the file gets larger.

There is another company that uses a database-heavy storage scheme that I think has the right idea.

Anyways, tonight was about understanding the steps needed to get from start to end for reading the data. Tomorrow is going to be about setting up some constants infrastructure. I want a macro for each of the numbers I'll commonly use, like all the data sizes, so that I can just say skip/read (numdata * sizeofdata) instead of (numdata * sizeof(double)) or something similarly cryptic.
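
One possible shape for that constants infrastructure, as a sketch only: the type codes, the macro names (SIZE_INT16, DATA_BYTES and so on) and the helper size_of_data are all hypothetical, and the three data types are a guess since the real ones aren't listed here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical codes for the three data types. */
enum { DATA_INT16 = 0, DATA_FLOAT = 1, DATA_DOUBLE = 2 };

/* One named size per data type, so sizeof() appears in only one place. */
#define SIZE_INT16  sizeof(int16_t)
#define SIZE_FLOAT  sizeof(float)
#define SIZE_DOUBLE sizeof(double)

/* Bytes occupied by numdata items that are sizeofdata bytes each. */
#define DATA_BYTES(numdata, sizeofdata) ((size_t)(numdata) * (sizeofdata))

/* Maps a type code from a block header to the size of one item.
 * Returns 0 for an unknown code so the caller can treat it as an error. */
static size_t size_of_data(int type)
{
    switch (type) {
    case DATA_INT16:  return SIZE_INT16;
    case DATA_FLOAT:  return SIZE_FLOAT;
    case DATA_DOUBLE: return SIZE_DOUBLE;
    default:          return 0;
    }
}
```

With these in place, a read or skip can be written as DATA_BYTES(numdata, size_of_data(type)), which reads close to the (numdata * sizeofdata) wording above.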

I wonder if I can call SQL from C....

1 comment:

  1. A few points:

    1) You are correct that "MUST read the whole file" has consequences, but you needn't be too concerned about them. fseek() lets you jump past the large chunks of data you don't care about without actually reading them.

    Using a relational database might make your application code look cleaner, but that's because the 'storage parsing' work has been outsourced to the DB program.

    As you hinted at, "without a specific purpose in mind" it's hard to make these kinds of judgments.

    2) C has libraries for SQL (SQLite, for example, is itself a C library). Python definitely has SQL libraries.
