Sunday, April 29, 2012

The Next Step

The two year experiment is over.

Did I learn C?

I think so. Maybe. It's complicated. In hindsight it was foolish to set out learning C. I should have started out simply learning to program with C as the medium for doing so. I spent way too much time getting involved in the particulars of C programming. I don't regret doing so, but I've found that programming is not a matter of language only.

Did I learn to program?

Yes, but I have a long way still to go. I'm at a point where the language isn't in my way anymore. Any language. For example, I'm currently struggling with finding a way to talk to my Arduino over the serial port using C. There's already a Python module for doing this that knocks out the hard part of that process in a few lines. I want to do it in C because I want to learn the internals, but it's not C that's giving me fits - it's the UNIX API for serial communication, termios. It's just a matter of reading the documentation and tutorials while bringing it all home with some test programs. Again, it's not the language I struggle with, it's reading other people's code and plowing through documentation. I have a feeling that this struggle is what programmers at any level do a lot of!

I have a dozen projects to tackle, two dozen if I count non-programming projects.

To that end I've started a new online journal at www.chrisheydrick.com so that I can have a place to document all of my hobbies. That's where I'll be from now on. I've been a Blogger user since before it was bought out by Google. I'm going to try Wordpress out for a while just for a change of pace. I might be back, who knows.






Wednesday, April 11, 2012

Work stuff will be dominating my time for the next few weeks.

The refactoring of the case reporting library was successful. The whole project was successful, actually - it proved that more elaborate case reporting and analysis tools were needed, and now I'm looking into commercially available options. I guess once the higher-ups got a taste of what's possible it opened up the option to spending money. I'm more than happy to put this project behind me. It was a valuable learning experience, but extremely dull.

Now I'm looking towards what's next. I started this journal to keep a record of my progress in learning to program in C, but what wound up happening is that I laid a foundation for learning to program in anything for whatever reason.

I've spent some of this evening researching microcontrollers. There have been several times in the past few years where I've almost purchased an Arduino, but I didn't go through with it because I was worried that it was too high-level to be a good taste of real microcontroller programming. Now that I have a better understanding of the tool chain (as well as what being an Arduino really means) I realize that's not necessarily the case, so I might pick one up soon.

I started reading Charles Petzold's Code. It's absolutely brilliant. I picked it up because I wanted to learn more about assembly programming (specifically for the fictional DCPU-16 that will be featured in Notch's 0x10c game). The book starts with things like Morse code and Braille, telegraph machines, and relays. It then bridges those concepts with (I believe it's called) set theory, and then on to combinatorial logic and gates (AND, OR, NOT, XOR, etc). That's where I am right now - right at the part where all these gates are combined to make half and full-adders. It goes into a decent amount of detail about how to use these to add and subtract binary numbers, but I'm afraid I'm getting a bit lost with the subtraction and I'll need to work it out on paper before moving on. I think that soon the book will transition to flip-flops, memory, and then eventually how it all combines into something that can be programmed.

I have a good idea for a next project to stretch myself, but I need to plow through more of the Objective-C book first.

Wednesday, March 28, 2012

Refactored case report program

I decided to just bite the bullet and reorganize the program I'm writing to make some reporting easier at work. Python project organization is a pain.

http://stackoverflow.com/questions/391879/organising-my-python-project

http://stackoverflow.com/questions/1642975/folder-and-file-organization-for-python-development

http://www.python.org/doc/essays/packages.html

I think I have it figured out, and some of the problems I had with importing my modules last night might have been operating system specific. I'm doing the development in OSX since I don't want to boot into my Linux VM every time I want to bang out an idea.

I'm using Excel for 90% of the reporting right now due to time constraints. I'd LOVE to do it 100% in Python so that I can do much more elaborate and flexible reporting. The last 10% is yanking out some numbers based off how long a case takes to close so that I can make a histogram. I spent way too long trying to get Excel to do that and failed. It's going to be much easier using Python, but I'm making it harder than it needs to be so that I can set up a solid library of functions I want to use in the future.
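The histogram step might look something like the sketch below. It is only an illustration: the sample dates are made up, the helper names days_to_close() and histogram() are hypothetical, and it bins by week simply because that seemed like a reasonable default.

```python
from collections import Counter
from datetime import date

def days_to_close(opened, closed):
    """Number of whole days between the open and close dates."""
    return (closed - opened).days

def histogram(durations, bin_width=7):
    """Count how many durations fall into each bin of bin_width days."""
    counts = Counter((d // bin_width) * bin_width for d in durations)
    return dict(sorted(counts.items()))

# Made-up sample cases as (opened, closed) pairs.
cases = [
    (date(2012, 3, 1), date(2012, 3, 3)),
    (date(2012, 3, 1), date(2012, 3, 12)),
    (date(2012, 3, 5), date(2012, 3, 9)),
    (date(2012, 3, 6), date(2012, 3, 27)),
]

bin_width = 7
durations = [days_to_close(o, c) for o, c in cases]
bins = histogram(durations, bin_width)

# Crude text histogram: one '#' per case in each bin.
for start, count in bins.items():
    print("%2d-%2d days: %s" % (start, start + bin_width - 1, "#" * count))
```

Keeping the binning in its own function means the same counts could later be handed to a plotting library instead of being printed.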

Monday, March 19, 2012

adding PE projects

I've been working to consolidate my Project Euler solutions into one big program, and so far using Git and GitHub has been working out splendidly. I added the first 4 projects, and now I'm working to add the next 6 or 7 which I have scattered between several computers.

Luckily the next several are on my Macbook, which has Git and GCC and is completely identical to the Linux box (at the level I'm working with it anyways). So, I have been trying to find a Git workflow that allows me to add and commit to the remote repo from OS X, and put the finishing touches in Linux.

Pre-step (did once): Clone the master git repository.

Step 1: Find file in OS X, verify it works in the original XCode project I made it in.
Step 2: Go to the local OS X repo directory and run "git pull", which gets all the latest changes I have made
Step 3: Copy over the existing PE solution to the local repo directory, and rename it appropriately
Step 4: Add, commit, then push to the remote repo
Step 5: Go to the local Linux repo directory and run "git pull"
Step 6: Make changes, add, commit, push.

Now what I've outlined is a very simple case. Typically I make a new branch on the OS X side, push that branch to the repo, pull that branch (with "git fetch") in Linux then merge to the master after I make changes. I'm still figuring out when to use "git pull" and when to use "git fetch". I think the work I'm doing is so trivial it doesn't matter.

Wednesday, March 14, 2012

A reminder about character array sizes

char *charpointer = "iameight\0";
char array[9] = "iameight\0";

The strlen size of charpointer is: 8
The strlen size of array is: 8

The sizeof size of charpointer is: 4
The sizeof size of array is: 9

The above is the result of a small "sanity check" program I keep around. It's super helpful.
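The same distinction can be poked at from Python with the ctypes module, which is handy when there's no C compiler nearby. This is a rough equivalent rather than the sanity check program itself: create_string_buffer() stands in for the char array, and c_char_p stands in for the char pointer.

```python
import ctypes

# A 9-byte char array holding "iameight" plus its terminating NUL,
# comparable to: char array[9] = "iameight";
array = ctypes.create_string_buffer(b"iameight", 9)

# Like strlen(array): .value stops at the first NUL byte.
print(len(array.value))

# Like sizeof(array): the full storage, NUL included.
print(ctypes.sizeof(array))

# Like sizeof(char *): just the size of a pointer on this machine.
print(ctypes.sizeof(ctypes.c_char_p))
```

The last number depends on the platform: 4 on a 32-bit build, 8 on a 64-bit one. That is why sizeof on the pointer says nothing about the length of the string it points to.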

grip gotten

Ok, I think I have a handle on basic Git (branching, merging, fetching from remote repos into a branch, pushing a remote branch, pull requests).

I'm somewhat back into C mode now. I should dust off some old forgotten projects, or maybe take the time to hack on some GTK stuff again.

Tuesday, March 13, 2012

Getting a grip on Git

Ok, so I know how to edit and add files to my local repository, and then push them to GitHub.

Matt forked my project. What can I do with this? I tried to add a new remote repo, create a branch, switch to that branch, and then pull down what he's done, but it didn't really work how I thought it would. I figured it would basically remove what I had existing and then put what he did, but it tried to merge the two and did a bad job of it.

My intention is to simply run things exactly how he is to try it out. What do I do?

I guess I'll be doing more studying later. Oof. I'm having a horrible time searching for what to do.

Monday, March 12, 2012

Still programming

I'm still learning. I didn't stop.

At some point it started to feel like work (some of it is actually related to my job) and I don't feel like writing anything about what is essentially boring stuff.

My nights and weekends have been spent on the case management and analysis project I'm doing in Python for my day job. I can't justify working on it during normal work hours since I'm too busy, and technically it's not something I'm supposed to be doing (although nobody has complained about the results so far). I could have knocked it out weeks ago but I'm taking my time to make sure I do it as pro as possible. For example, I'm trying to gear it towards a proper distutils installable module, even though it wouldn't ever be used that way.

What's really irritating is that I've learned so much over the past few weeks that I want to scrap the whole thing and start over again. I realize, though, that this is a dangerous road, so I've written out exactly what the first version needs to accomplish to be functional, and once that's in place and running I'll branch the code and do a massive refactor. That's another thing - I'm trying my best to use Git properly. I put a lot of time over the weekend into reading Pro Git, which has so far been either really helpful or really confusing.

Right now I'm almost done with the first phase of the first version of the program. Phase one is to implement a robust class for loading and filtering the case data, and a class for pulling out numbers from the data. For example I'd load a case list, pare it down to a specific date range and a specific territory, and then count how many cases there are in that filtered list, or count how long the cases took to close, count how many unique accounts there were - stuff like that. Phase two is to use that functionality for generating reports in the form of graphs and HTML (to be then converted to PDF or straight to email).

My hope for version 2 is to implement a proper MVC design pattern with some sort of preview for the output prior to exporting.

Let's call version 3 a proper application for not only viewing cases but for on the fly filtering and adding. By that point I'll probably have to start using a SQL database, but hopefully I'll be ready for that by then.

I know - I've said it before that this isn't ChrisLearnsPython, but I'm having a good time with it and it's super hot right now. I complained a lot at first that my hindrance with learning more C was finding projects to tackle. I can't STOP thinking of projects to do in Python.

Once I do get back to C it probably won't be vanilla C, I'll likely keep on with my Objective-C learning so that I can tinker more with Cocoa and iOS.

Friday, March 9, 2012

Off Topic: Video Conferencing

I've been using video communication tools since the late 90's. The first webcam I ever used (probably in the mid-90's) was black and white and connected with a parallel port interface. I only had one friend with one, so it wasn't all that useful. By 2000 I was doing it on a regular basis with friends, using Netmeeting or other tools - I forget what they were all called. I think AIM and ICQ had some sort of video features.

Anyways, the war was waged and Skype is the winner (on Windows, anyways).

I've used it in my personal life quite a bit. Not so much anymore since the wife moved in, but since my parents are overseas I still use it every so often. As a personal communication tool I find it a mandatory install on all my machines, and will for a long time.

As a professional tool, however... I'm undecided. We have a paid account in the office that lets us do multiple video chats at once, and when it works well it really makes a difference. However, when it misbehaves it tanks the productivity of the meeting.

Here are the reasons I think the technology is not quite there yet:

1) Interface. There needs to be a full screen UI that assumes you're viewing the screen from far away, like in a conference room. I'm fairly young and even I can't see text that well from fifteen feet away. It often turns out the person closest to the front of the room controls the system.

2) Problems compound when more people connect in. If someone cuts out in the chat it might not be cutting out for everyone. One on one it's easy to tell who's having problems. Two people joining in and everyone has to confirm with two people that they're audible/visible. Three people... four people... etc. The first ten minutes is everyone asking each other "can you hear me?".

3) Feedback. I've insisted that everyone wear headphones to avoid squealing feedback. Laptops are the worst offenders since the speakers are too close to the mic.

4) Not knowing who's available. I wish Skype was smart enough to do some face detection to set the here/away indicator when someone is actually in front of the computer.

Is video even that great of a leap past audio for your average meeting? The quality isn't so great that I could hold up a small device and anyone could discern details. I might be too socially dim to care that much about seeing someone's face.

Anyways, just some off topic thoughts that run through my head several times a week.


Sunday, March 4, 2012

Project Euler progress on GitHub

I'm unifying my Project Euler progress, and I'm going to use this opportunity to learn more about Git and GitHub.

I thought a good deal about where to do the actual work (Windows, OS X, Linux) and settled on Linux since Windows has a bizarre Git implementation, and OS X would let me cheat by using a front end for Git and keep me away from the command line (which I need to be stronger at).

Git is sort of easy to learn. I feel like I'm doing things the hard way sometimes. I had assumed it would just "know" when I add a new file to the project, for example, but it didn't. I have to really be mindful of what I'm doing (in hindsight an obvious sentiment).

https://github.com/cheydrick/Project-Euler

The first go at it is just the foundation. Next I'll add the ability to choose which project to run from the command line, and then continue adding the projects I have already finished. I'm looking forward to this, and I hope I'll have the chance to learn more about Git and proper source control practices. Maybe I can get someone to fork my repository and make a change so that I can learn how to merge changes into the master branch (or is it clone and pull... woof with the terminology).

Thursday, March 1, 2012

Biking

I haven't been programming much this week. Work has been very busy and I'm too tired to do anything other than mindlessly tool around on the internet.

My wife has got me riding my bike with her. We've gone out a few late afternoons this week. Sore, but getting better. I'm happiest when I'm improving myself somehow. So, I don't feel bad about not programming since I'm doing something to improve myself.

Thursday, February 23, 2012

Python class constructor polymorphism

C++ has this neat class functionality where it will choose the correct class constructor based off what arguments are passed during initialization of the class. Python seems to lack that. I want to be able to pass either a string to the class constructor method OR a reference to an already opened file. It would make my class so much more functional. It seems I'm forced to make one class for each scenario. Well, maybe I could do some type detection and make some decisions in the class constructor based off what is passed, but that's.... hacky?
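For what it's worth, two approaches seem to be common in Python for this, and neither needs a second class. One is to check inside __init__ whether the argument behaves like a file, and the other is to add a classmethod as a named alternate constructor. A minimal sketch, assuming the string is a file path; the class name CaseFile and the method from_string() are made up for illustration:

```python
import io

class CaseFile:
    """Hypothetical class that accepts either a path or an open file object."""

    def __init__(self, source):
        if hasattr(source, "read"):
            # Looks like an already opened file: use it directly.
            self.text = source.read()
        else:
            # Otherwise treat it as a path on disk.
            with open(source) as f:
                self.text = f.read()

    @classmethod
    def from_string(cls, text):
        """Alternate constructor: build an instance from in-memory text."""
        return cls(io.StringIO(text))

# Both routes end up with the same data.
a = CaseFile(io.StringIO("id,owner\n1,Bob\n"))
b = CaseFile.from_string("id,owner\n1,Bob\n")
print(a.text == b.text)
```

Checking for a read() method instead of an exact type is the usual duck-typing compromise: anything file-like works, including io.StringIO in tests.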

Also, when did I start getting to a point where I could legit complain about Python OOP features?

Good blog post on *args and **kwargs

http://www.saltycrane.com/blog/2008/01/how-to-use-args-and-kwargs-in-python/

I haven't found myself needing variable argument lists in functions, but I see them frequently in other people's code. Variable-length argument lists in C are something I understand on a superficial level but have never needed to use myself, and I almost never see them in other code I read.
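As a quick reminder of what the two forms do, here is a small sketch. The function names describe() and logged() are made up; the point is only that *args collects extra positional arguments into a tuple, **kwargs collects extra keyword arguments into a dict, and the same syntax at a call site unpacks them again.

```python
def describe(*args, **kwargs):
    """Return a string showing how the arguments were collected."""
    return "args=%r kwargs=%r" % (args, sorted(kwargs.items()))

def logged(func, *args, **kwargs):
    """Forward whatever we were given to func, unchanged."""
    return func(*args, **kwargs)

print(describe(1, 2, name="Bob"))
print(logged(max, 3, 7, 5))
```

The forwarding pattern in logged() is probably why the syntax shows up so often in other people's code: wrappers and decorators can pass arguments through without knowing anything about the function they wrap.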

Monday, February 20, 2012

Document

I blocked off some time at work to knock out some more of the tool I'm building for extracting/analyzing support cases. It's been about a week since I had last dug into it so it was tough to figure out where I had left off. Since I didn't have much time before I had to be home I decided to abandon coding and start documenting. I wrote out the general reason for the program, some sample usage of the main class and its instance variables and methods, and documented the purpose of the methods I had already written.

Honestly it was a weird form of procrastination, but now I know exactly what's done and this is a good pointer for where I need to go. There was a link on Reddit I wish I had bookmarked about programming "backwards" where you write the documentation first - even writing out sample implementations of APIs not written yet. It provides a sort of design document and gives direction of what the actual implementations need to do.

I'll probably write about this more later (actually maybe I already wrote about it...) in a specific sense. The largest struggle I've had with this program is how to store results. Should the class members return data internally, or to an outside variable? My experience with C has driven me to the latter. Most functions return the value you're asking it to compute. Sometimes the function will work with pointers to outside variables and technically return void (or 1 or 0 or whatever means success) but it still involves outside variables. Once you start working with classes you have to somewhat abandon those rules and try to stay within the class. I am always repeating this: "A class is where you keep data and the functions that can alter that data". A natural extension of that is "data in the class that you modify will also be in that class".

I struggle with object oriented programming. This must be something a CS major puts a great deal of time into, and maybe that's why Java is a dominant language in CS programs.

Thursday, February 16, 2012

Reading bytes in Python

Today's lesson in why you should RTFM, but more on that later.

One of the biggest successes I had while learning C was reading text and binary data from files. I'm going to tackle the same task in Python. The first step is going to be reading in some data (the first 136 bytes) of the file header.

The file I'm reading is the same sort of file I learned to read in C - a file with some header data with various things like the date it was acquired and some comments, and then large chunks of digitized analog data (like a .wav file in a way).

I'm starting small - I only want to read in the first 136 bytes of the file. The first 4 bytes represent an integer that is always in the file (sort of a marker that tells us where it came from). The next 4 bytes are the version of the file format (also an integer), and the next 128 bytes represent a string of 128 characters (which are 1 byte each, so 128 characters).

I've spent a good deal of time prepping for this task - most of the info I needed was in the documentation for file objects and the struct module. In short, I'm going to read in a known number of bytes using the read() method for file objects, and then "unpack" those bytes into a specific format (integers and character strings) using the unpack() function in the struct module. So, here's a start:



I once read somewhere that using a class full of empty instance (self) variables is a good way to mimic how structures in C look. I don't know if that's very "Pythonic" but it works for me. In the PHeader class I've defined three variables that I'm going to fill in with data from the file. The actual code that executes is under "if __name__ == '__main__'", which is just a fancy Python way of saying "if this .py file is run on its own then do what's underneath".

First I open the file as "p", and initialize "s" as an instance of PHeader. I know that p.read(N) will read in N bytes of the file, so I need to somehow tell Python to interpret those four bytes as an integer (as opposed to another data type that is 4 bytes) and then make s.MagicNumber equal that resulting number.

So that's where the struct module's unpack() function comes in. unpack() has this prototype: unpack(fmt, string). fmt is the format of the bytes being read (we want an integer so we pass it "i") and string is the bytes to unpack. Well, the result of p.read(4) is our string, so this line...

s.MagicNumber = unpack('i', (p.read(4)))

...gets our four bytes, interprets that as an integer, and passes the result to s.MagicNumber. A big caveat that I missed while writing this that caused a great deal of confusion is that it doesn't ACTUALLY pass JUST the integer. It passes a Python data type called a tuple with the integer I wanted as the first element of that tuple. Tuples (and other Python data types) work a lot like arrays in other languages - but more on that in a second.

The next line does pretty much the same thing...

s.Version = unpack('i', (p.read(4)))

Ok cool, we have now read two integers from the file and stored them in some variables. This next part is tricky and caused a lot of wailing and gnashing of teeth on my part. The struct documentation told me that "i" is the format character for integers, and "s" is the format character for character arrays (strings). Since the next 128 bytes of the file are a row of 128 characters (a string) I figured I could just replace the "i" with an "s" and then do p.read(128). This was incorrect. After a lot of pondering over error messages I carefully read through the struct module documentation and found that you have to precede the "s" with the number of characters to be read, like "128s". So that resulted in this line...

s.Comment = unpack('128s', (p.read(128)))

...and all was well.

Remember I said that unpack() returns a tuple, and in our case the first element of that tuple is that actual integer or character array we asked for from the file? Getting the first element out of a tuple is a lot like getting the first element out of a C array. If I have a tuple called MyTuple I can get the first element by asking for MyTuple[0]. So the print lines...


 
print('Magic Number: %s' % hex(s.MagicNumber[0]))
print('Version: %d' % s.Version[0])
print('Comment: %s' % s.Comment[0])

...do exactly that. Oh - the first line says hex(s.MagicNumber[0]) because I want the integer returned to be printed out as a hexadecimal number.
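Putting the pieces together, the steps described in this post can be sketched roughly as follows. This is a reconstruction from the description, not the original listing. It assumes Python 3, and it builds a made-up header in memory (the magic number, version, and comment values are invented) so that it runs without the real data file.

```python
import io
import struct

class PHeader:
    """Plain container of instance variables, playing the role of a C struct."""
    def __init__(self):
        self.MagicNumber = None
        self.Version = None
        self.Comment = None

def read_header(f):
    """Read the 136-byte header from a file opened in binary mode."""
    s = PHeader()
    # unpack() always returns a tuple, so take element [0] right away.
    s.MagicNumber = struct.unpack('i', f.read(4))[0]
    s.Version = struct.unpack('i', f.read(4))[0]
    s.Comment = struct.unpack('128s', f.read(128))[0]
    return s

# Invented sample header: two 4-byte integers and a 128-byte comment field.
# pack() pads the comment with NUL bytes up to 128.
sample = (struct.pack('i', 0x1234ABCD)
          + struct.pack('i', 2)
          + struct.pack('128s', b'test comment'))

s = read_header(io.BytesIO(sample))
print('Magic Number: %s' % hex(s.MagicNumber))
print('Version: %d' % s.Version)
# Strip the NUL padding before showing the comment.
print('Comment: %s' % s.Comment.rstrip(b'\x00').decode('ascii'))
```

Indexing with [0] at the point of unpacking keeps the tuple detail in one place, so the rest of the program can treat the header fields as plain values. With a real file, io.BytesIO(sample) would be replaced by open(path, 'rb').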

All said and done that dozen lines of code took about an hour, which isn't bad considering I started out with only a superficial knowledge of how to read bytes. Hopefully the next step of reading the more important data from the file won't be so traumatic now.



Tuesday, February 7, 2012

Off Topic: Gaming

I remember the first video game I played only vaguely - some text adventure. The strong memories are of DOS games like Bumpy and Continuum that we had on the 286 my dad brought home. I didn't know back then that having a computer was anything special. It was just a natural thing, and learning to use it came naturally. I must have read the GW-BASIC manual cover to cover a dozen times (not understanding most of it, naturally). One day my dad brought home a laptop - the first I had ever seen in person. It was massive and even for the time slow, but somehow it ran a demo of Spear of Destiny - the first 3d first person shooter I had ever seen, and to this day the most magical gaming experience I've ever had. Ever since then I've been chasing that dragon. Doom, Quake, Unreal, Half-Life, all of them. Love them. Can't get enough. The games I've fallen in love with have been pretty typical of most gamers. Quake, Half-life, Bioshock, and whatnot. All still hold a special place, and all I still play through at least once a year. I just finished my yearly run of HL2 and Portal.

MMOs are a different beast. My first MMO was Ultima Online, but not in the traditional sense. By the time I got into it a few folks had figured out how to emulate the servers, so I cut my teeth on The Alter Realm UO shard, which ran Sphere Server. I played that for years. I still jump into UO from time to time, but not on TAR since that went belly up about seven years ago. In Por Ylem is pretty good, but again - chasing the dragon.

Since UO I've tried (the official versions of) Dark Age of Camelot, World of Warcraft, EVE, Anarchy Online, Age of Conan, and Star Wars: Galaxies. I played Galaxies for about three years and loved it - the best crafting of any MMO to date. It's a damned shame what happened to it, and I'll never trust SOE with a game again. Jeez, I'm still bitter about that.

I recently started playing Star Wars: The Old Republic. I liked the Knights of the Old Republic series. Good stuff all around. It's translated to MMO format fairly well, but I have some complaints (as per tradition of all MMO players). First is the crafting. It's asinine. You just gather resources and click a recipe to craft. There is no personal influence at all, and every Widget A you craft is identical in every way to everyone else's Widget A. Yes, you can sometimes roll a 20 and make a Superior Widget A, but it's still the same as everyone else. The second is that this game has somehow driven people to hit the max level as soon as humanly possible. Everyone is either level 12 or level 50. This worries me because I'm not leveling very fast (I'm a filthy casual player) and all the patches are heavy on the end-game content that I won't see for a long time. Third is that so many of the missions are the same old "kill X space-rats". But that's MMOs for you. I'm enjoying the attention to detail in the worlds. Balmorra has wrecked ships sprinkled in the landscape. It's quite fascinating. Will I keep it up after a few months? It depends on how social I get. Right now I'm playing very solo due to not having a regular schedule. I might roll an alt to play some different storyline, but unless some major reworks to the crafting, space, and in-game market are in the pipeline I'll lose interest by the summer.

I'm not sure how I started this post. I think I've been irritated at myself for owning a lot of games I haven't played all the way through. I need to finish Skyrim. I loved Oblivion. Skyrim is fantastic, but it's turning into a chore. Battlefield 3 was a day 1 purchase, but it's the first game I've ever played that's made me feel old. I just can't compete with the current generation of gamers. Mass Effect was good. I'm about 90% through ME2, just gotta buckle down and knock out the last few parts. Same with Bioshock 2. I don't play Team Fortress 2 anymore - too many new things to keep up with.

The last game I really sank into was Deus Ex: Human Revolution. I likely won't play through it again, but it was a good time.

Ok, I'm going to bed. I could write about games all night.


Monday, February 6, 2012

First useful OOP

I finally made a functional, useful class.



I've made functional classes before, but they were only for show. This is the first time I've used OOP that actually makes programming the rest of the application easier and more organized. I'm not saying I'm a believer in OOP for the sake of OOP, but it works for me here.

Creating a new instance of this class is easy - you have to pass it a legit .csv file (no checks yet, those are coming) and on creation it chucks the data in the .csv to a list, where each list element is a type dict with key/values that match the columns/values per row in the .csv. This alone took me for freaking ever to make work because it took me for freaking ever to get my head around these Python types.

Almost every method in the Cases class is about whittling down the internally stored list of cases. Calling CasesBySalesman(salesman) alters the internally stored list to just those by a particular salesman.

I labored heavily over how to do this in the most efficient way possible. The first incarnation of this class had each method return a new list, leaving the original untouched. This, I felt, left too much work to the program that would be using this class to handle. My knowledge of the "spirit" of OOP is limited since my experience is limited, but my favorite definition so far of a class is "data, and methods to perform actions on that data", so I went with that. There is a convenient reset method to go back to baseline if necessary.

So, in the eventual program that will use this class it would be as simple as:

A = Cases("file.csv")
A.CasesBySalesman("Bob")
A.OpenCases()

And A.caselist would be a list of dicts with the open cases for Bob. This seems very straightforward. My first pass mentioned above would have involved something like:

A = Cases("file.csv")
allcases = A.AllCases()
casesbybob = A.CasesBySalesman(allcases, "Bob")
casesbybobopen = A.OpenCases(casesbybobopen)

So yea, too many variables, and here I've reduced my Cases class to just a fancy holder of functions, instead of encapsulated methods to perform actions on internal data.

I think I made the right choice, and I'm quite happy with it.
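A stripped-down sketch of that design, for illustration only. The column names "Salesman" and "Status", the Reset() method name, and the sample rows are assumptions. To keep the example self-contained the constructor takes an open file-like object rather than a file name.

```python
import csv
import io

class Cases:
    """Holds a list of case dicts and methods that whittle that list down."""

    def __init__(self, csvfile):
        # csvfile is any open text file (or file-like object) with a header row.
        # DictReader turns each row into a dict keyed by column name.
        self._allcases = list(csv.DictReader(csvfile))
        self.caselist = list(self._allcases)

    def CasesBySalesman(self, salesman):
        """Narrow the stored list to one salesman's cases."""
        self.caselist = [c for c in self.caselist if c["Salesman"] == salesman]

    def OpenCases(self):
        """Narrow the stored list to cases that are still open."""
        self.caselist = [c for c in self.caselist if c["Status"] == "Open"]

    def Reset(self):
        """Go back to the full list that was loaded from the file."""
        self.caselist = list(self._allcases)

# Made-up sample data standing in for the real .csv file.
data = io.StringIO(
    "CaseID,Salesman,Status\n"
    "1,Bob,Open\n"
    "2,Bob,Closed\n"
    "3,Alice,Open\n"
)

A = Cases(data)
A.CasesBySalesman("Bob")
A.OpenCases()
print([c["CaseID"] for c in A.caselist])

A.Reset()
print(len(A.caselist))
```

Keeping an untouched copy in _allcases is what makes the reset cheap. The trade-off of mutating caselist in place is that the order of the filter calls matters and the caller has to remember to reset between reports.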

EDIT: There is a copy/paste problem in the code above, but it's not important.

Clever way of reading data into a dict type

https://github.com/breuderink/eegtools/blob/master/eegtools/io/edfplus.py

I like what's in the edf_header() function. I'm going to experiment with doing that.

Every time I noodle through GitHub I learn something new.

EDIT: The BaseEDFReader class has some justification for something I had to do to make something work - you pass the class init method a file and then it's assigning that file name to a self variable. I was struggling with why it was necessary for my stuff to work so this is evidence it's a thing you're supposed to do... for some reason.

Friday, February 3, 2012

Gist test

GitHub has this thing called "gist" that is sort of like Pastebin:



Might use this if it behaves better than Pastebin.

Push

I cleaned up my GitHub account and made a repository for my hacked together but very functional security camera program.

https://github.com/cheydrick/Security-Camera

I'll be making a few changes to it soon, and I'd like to take this opportunity to get to know Git (and GitHub) better. The first change is to make it more camera and save location agnostic. Those options should be set via command line. Even better I'd like it to auto-locate a local Dropbox folder as a default if no explicit save location is made.
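One possible shape for those options, sketched with the standard argparse module. The flag names --camera and --save-dir are hypothetical, as is the rule for finding Dropbox (it just checks for a "Dropbox" folder in the home directory, which may not hold on every setup).

```python
import argparse
import os

def default_save_dir(home=None):
    """Use <home>/Dropbox if it exists, otherwise the current directory."""
    home = home or os.path.expanduser("~")
    dropbox = os.path.join(home, "Dropbox")
    return dropbox if os.path.isdir(dropbox) else os.getcwd()

def parse_args(argv=None):
    """Parse the command line; argv=None means use sys.argv."""
    parser = argparse.ArgumentParser(description="Security camera capture")
    parser.add_argument("--camera", type=int, default=0,
                        help="index of the camera to open")
    parser.add_argument("--save-dir", default=None,
                        help="where to save captured images")
    args = parser.parse_args(argv)
    if args.save_dir is None:
        # No explicit location given: fall back to the Dropbox guess.
        args.save_dir = default_save_dir()
    return args

# Example invocation with explicit options.
args = parse_args(["--camera", "1", "--save-dir", "/tmp/captures"])
print(args.camera, args.save_dir)
```

Passing the argument list into parse_args() explicitly makes the option handling easy to try out without touching the real command line.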

I got a new computer at work so I took the time to properly set up a Python development environment using Eclipse and the PyDev plugin. I'm shuffling my old support case reporting code into a more organized structure. One big change is that I'm trying to use proper classes instead of a big list of functions. Now the big list of functions is a big list of class methods, so maybe I can hang out with the cool OOP kids now, I dunno.

The new Python books are coming in handy. I'm not reading them from start to finish - just using them as a reference.

I feel super bad about neglecting C so far this year. My compromise is that I want to revisit some experiments I ran last summer where I compiled a library in C that I could call from Python. I did the usual "Hello World" stuff, but I want to explore how to create dynamically allocated arrays of stuff in C in a way that Python can then see that data.

I left my C learning progress at a good stopping point. Getting my head around OOP in Python first should help when I revisit Objective-C. Uh, Obj-C and I got in a fight and that's why I walked away from it to cool off. That's a whole different story.

Saturday, January 28, 2012

clever

I've been experimenting with generating HTML for reports. It's pretty easy to do - writing a text file is more or less trivial in Python. Something I had struggled with is having large amounts of HTML text with placeholder variables. It gets unwieldy if you want to make little edits. I ran across this in the Python online documentation:

http://docs.python.org/howto/webservers.html#templates

I had been doing this in my code:

myvariable = "here"
htmltext = "imagine this is long html %s" % myvariable

which would make htmltext be "imagine this is long html here".

The issue is that my HTML is often super long (tables) and making small edits and making sure my variables are in line is annoying. The way they did this in the link is clever.

myvariable = "here"
htmltext = "imagine this is long html %s"
result = htmltext % myvariable

Now I can separate things out for readability. I should read the documentation more often!
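Taking the same idea one step further, here is a small sketch that keeps the templates apart from the data and fills them with named placeholders (%(name)s) from a dict. The template names and the sample records are made up for illustration, and the values are not HTML-escaped, so it assumes the data is already safe to drop into a page.

```python
# Templates live on their own; the data is merged in later.
ROW_TEMPLATE = "<tr><td>%(name)s</td><td>%(age)s</td></tr>"
PAGE_TEMPLATE = "<html><body><table>\n%(rows)s\n</table></body></html>"


def render(records):
    """Build one table row per record, then drop the rows into the page."""
    rows = "\n".join(ROW_TEMPLATE % record for record in records)
    return PAGE_TEMPLATE % {"rows": rows}


html = render([{"name": "chris", "age": "29"}, {"name": "bob", "age": "42"}])
print(html)
```

With named placeholders the order of the variables no longer has to line up with the order of the %s markers, which is the part that gets annoying in long tables.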

Sunday, January 22, 2012

Other activities

I'm about 50 pages into this statistics book. So far I haven't hit anything I haven't seen before. This book does a fairly good job of describing why certain metrics are used, as well as describing ways they can be meaningless.  That being said, it's not nearly as mathematically rigorous (so far) as I had expected. It's pages and pages of text with one or two punchline formulas defining the previously described behavior. I'm not going to knock it too hard yet - I guess I expected more math and I'm curious as to why it's not there. I flipped through the later chapters and found mostly the same pattern.

The one statistics course I took in college wasn't a "stats for engineers" course, but it still had a good deal of math involved. For some reason I was really good at it, and I think the reason is kind of silly. A typical formula for basic statistics (like the one for standard deviation) is kind of "pretty". It has a sigma and summation symbol (upper case sigma?) and I would mindlessly doodle it over and over again in various ways, so I wound up memorizing them accidentally. Over time I've forgotten them, naturally. Math retention has never been my strong suit - quite unlike the programming retention I seem to have. Maybe it's a function of practice, not innate ability.

I was figuring that statistics would be mathematically rigorous enough that it would prevent me from tackling anything else while researching it. That doesn't seem to be the case so I'm tempted to bring in something else to do. I haven't stopped programming. I've actually done more programming this new year than usual, but it's in Python and it's work related. It scratches the programming itch just as well, so I'm not jumping at tackling the next chapter in the Obj-C book. Hopefully that doesn't hurt me in the long run.

Saturday, January 21, 2012

Wednesday, January 18, 2012

Good call, Zed.

I got about halfway through Learn Python the Hard Way before I got distracted. I intend to get back into it shortly - particularly for the web programming parts.

There is a chapter called "Advice From An Old Programmer" that I hadn't read until today. One excerpt from this is "Programming as a profession is only moderately interesting. It can be a good job, but you could make about the same money and be happier running a fast food joint. You're much better off using code as your secret weapon in another profession." I put in bold a sentence that really hit me kind of hard.

I've treated programming as a hobby I might one day (five or ten years from now) turn into a career. Now that I think about it I realize it was a handy tool to have in college, in the lab I used to work in, and in the job I have now. Maybe I can leverage it as my "edge". Maybe everyone should. I've been perpetually bummed that I'm not a programmer or engineer - if I take Zed's advice to heart then maybe I won't have to be anymore.

Tuesday, January 17, 2012

The current Python project

So my goal with Python here is to take a spreadsheet I use at work to document support cases and output it into something more readable. The secondary goal is to extract some useful statistics out of it. The second goal was last year's primary goal which got me into Python in the first place, but I've done a lot of the bean counting in Excel itself (an altogether useful lesson but not journal-worthy).

A good start will be a monthly report. The exercise of only performing actions on specific dicts in the list that have a certain value bound to a key is essential - like what if I only wanted to view cases that are X days old, or are in a particular region? I think I've solved that problem. The next big deal is to take the data in those dicts and format them into something presentable. I'm thinking initially HTML since I know how to do tables in that and in my security camera project I intermixed variables with static text. If I'm brave I might try converting the dicts to XML for the sake of learning XML, but we'll see. I'm dancing the fine line between hobby project and work.
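A rough sketch of that kind of selection, assuming the cases are already a list of dicts. The field names (id, region, opened) and the sample cases are hypothetical and not taken from the real spreadsheet.

```python
from datetime import date

# Made-up sample cases; the real columns will differ.
cases = [
    {"id": "101", "region": "east", "opened": date(2012, 1, 3)},
    {"id": "102", "region": "west", "opened": date(2011, 12, 1)},
    {"id": "103", "region": "east", "opened": date(2011, 11, 15)},
]


def select(records, **wanted):
    """Keep only the dicts whose values match every key=value given."""
    return [r for r in records
            if all(r.get(key) == value for key, value in wanted.items())]


def older_than(records, days, today):
    """Keep only the cases opened more than `days` days before `today`."""
    return [r for r in records if (today - r["opened"]).days > days]


today = date(2012, 1, 17)
east = select(cases, region="east")
stale_east = older_than(east, 30, today)
print([c["id"] for c in stale_east])
```

Because each helper takes a list and returns a list, the filters can be chained in any order before the results are handed to whatever does the formatting.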


Python types are killing me

I spent probably two hours to get to this point:




All I wanted to do was import a .csv file as a dict type (values accessible by keys) and then be able to selectively choose which ones to do work on.

That didn't make sense, let's see if I can better explain it.

My data is a spreadsheet exported to .csv. Each row is a set of data with different things I need a record of, like a date, name, notes, etc. Each column has a title, and DictReader sees this and makes those titles the keys.

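So, if my .csv file looks like this

name,age
chris,29
bob,42

the dicts look like:

{'name': 'chris', 'age': '29'}
{'name': 'bob', 'age': '42'}

(DictReader hands every value back as a string, so the ages are '29' and '42' rather than numbers.)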

So let's say I have a lot of these records and only want to do something with the people who are 29. The above code will do that for me (just replace "print row" with whatever I really want to do).
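For reference, a minimal sketch of this kind of loop. It is a guess at the general shape rather than the original snippet, it assumes Python 3, and it reads from an in-memory string so that it runs without a file on disk. The function name rows_with_age is made up.

```python
import csv
import io

# Stand-in for the exported spreadsheet; normally this would be open("cases.csv").
CSV_TEXT = "name,age\nchris,29\nbob,42\n"


def rows_with_age(csv_file, age):
    """Return every row whose 'age' column equals `age` (compared as strings)."""
    reader = csv.DictReader(csv_file)   # first line becomes the dict keys
    matches = []
    for row in reader:                  # each row is a dict keyed by column title
        if row["age"] == age:
            matches.append(row)
    return matches


for row in rows_with_age(io.StringIO(CSV_TEXT), "29"):
    print(row)   # replace with whatever really needs doing
```

The comparison is against the string "29" because the csv module does not convert values to numbers on its own.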

It took me forever to get to this point because I'm still not used to handling data in Python. I thought that "reader" was a 2d array of dicts, but it's not. I have to be "Pythony" about things.

Dicts, lists, and tuples. I'm having a hard time getting my head around when to use each. Let's say I had a whole lot of data with lots of ages and I wanted to shuttle all the people of a certain age into their own variable for handling. What type is this variable? Is it a list of dicts? Is it something more akin to what "reader" is? Is the "reader" that csv.DictReader dumped out a list of dicts? The Python console says no - it's an instance of something. If it's not a data type then how am I able to iterate through it with "for row in reader"?
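A short sketch that pokes at those questions, assuming Python 3 and an in-memory CSV with made-up names. The reader is an iterator object: "for row in reader" works because it knows how to hand out one row at a time, and list(reader) collects those rows into an ordinary list of dicts.

```python
import csv
import io

reader = csv.DictReader(io.StringIO("name,age\nchris,29\nbob,42\nalice,29\n"))

print(type(reader))      # an instance of csv.DictReader, not a list

# Pulling every row out of the reader gives a plain list of dicts.
people = list(reader)

# A list comprehension shuttles the matching people into their own list.
age_29 = [person for person in people if person["age"] == "29"]
print([person["name"] for person in age_29])

# The reader is used up after one pass; asking again yields nothing.
leftover = list(reader)
print(leftover)
```

So a list of dicts is a reasonable type for the "people of a certain age" variable, and the reader itself only needs to be walked once.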

Every time I jump into Python I wind up coming out with more questions than answers. I have trouble with Python that I never have with C. I will concede that when I do realize how to do something, it's fairly straightforward and only a few lines of code. That's nice. I might need to just surrender the need to know exactly what's going on and hope that this ignorance doesn't cause a massive bug that I can't track down due to not knowing the internals well enough.

Thursday, January 12, 2012

Tome de Python

I went ahead and ordered that 1100+ page Python book that O'Reilly publishes. There are some Things™ I want to do and I need to just dive in.

The internet is great, but nothing beats a book. Maybe one day I'll be comfortable enough with e-reader tech to not need the dead tree version, but that day hasn't come yet. It wasn't until recently I purchased a laptop - the tech just wasn't at a point I considered worthy of purchasing up until now. Maybe by the time the iPad 5 or Kindle Plasma Storm comes out I'll be ready.

Wednesday, January 11, 2012

Python CSV module DictReader

I'm tossing this link here as a reminder to take a look at it later.

http://www.doughellmann.com/PyMOTW/csv/

One of the reasons I stopped casually learning Python is because the documentation is incomprehensible to me. I wrote about this once before.

I'm hitting a problem at work that plagued me last year, and being able to pick stuff out of a .csv file (or an .xlsx file) would be really nice.

Tuesday, January 10, 2012

Statistics

A coworker let me borrow "Statistical Analysis: An Interdisciplinary Introduction to Univariate & Multivariate Methods" by Sam Kash Kachigan.

I'm going to be picking up some self-learning material for the office. If I like what I see with this book then maybe I won't need to get a larger stats text. We'll see. The other book I want is a good linear algebra textbook. I still have my calculus book from college.

So yea, hopefully I'll be able to round out my self-learning with some math. We'll see if it sticks as well as programming has.

Tuesday, January 3, 2012

Better linked list plan

I'm going to implement a singly linked list with a void pointer to some arbitrary data. The arbitrary data I want to try is a structure with a pointer to a character array and an integer to hold a count. I want to scan through a text file and count every unique word.

1) Take in a word.
2) See if it's already a known word in the list
2a) If it's not in the list, create a new node and set the counter to 1
2b) If it's in the list then set that node's counter +1
3) Print out the results
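The steps above can be sketched as follows. This is in Python rather than C, so it only mirrors the shape of the plan: the node's data attribute stands in for the void pointer, and the WordCount class stands in for the struct holding the word and its counter. All the names are made up for illustration.

```python
class Node:
    """One list node; `data` can be anything, like the void pointer in the plan."""
    def __init__(self, data):
        self.data = data
        self.next = None


class WordCount:
    """Stand-in for the struct holding a word and its counter."""
    def __init__(self, word):
        self.word = word
        self.count = 1


def count_words(text):
    """Build an unordered singly linked list of unique words with counts."""
    head = None
    tail = None
    for word in text.split():                 # 1) take in a word
        node = head
        while node is not None and node.data.word != word:
            node = node.next                  # 2) scan for a known word
        if node is None:                      # 2a) new word: append a node
            new_node = Node(WordCount(word))
            if head is None:
                head = new_node
            else:
                tail.next = new_node
            tail = new_node
        else:                                 # 2b) known word: bump its counter
            node.data.count += 1
    return head


def as_pairs(head):
    """Walk the list and collect (word, count) pairs in insertion order."""
    pairs = []
    node = head
    while node is not None:
        pairs.append((node.data.word, node.data.count))
        node = node.next
    return pairs


print(as_pairs(count_words("the cat and the hat and the bat")))  # 3) results
```

Every lookup walks the list from the head, which is the slowness that an ordered or tree-based version would address.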

An advanced version of this would do something fancy about what order the words are stored in - the first implementation will no doubt be unordered and thus get slower as the list of words grows. I'm not sure yet of the best way to store words such that scanning to find a word is fast. I somewhat recall a way of doing this using a tree structure, but that's after I've gotten more list experience under my belt.

Monday, January 2, 2012

Linked List

I have been trying to stick to this rule where I'll never use a pre-made implementation of an advanced programming concept without first creating my own version of it. The Objective-C stuff I've been working on has me frequently using Foundation's ordered collections (NSArray and NSMutableArray), which fill the role a linked list would. I've never created my own linked list implementation so this morning I sat down and hashed out the creation, adding of things, and deletion of a doubly linked list of character pointers. To make it complete I'll have to include insertion and deletion of nodes at any arbitrary point, but that shouldn't be too hard.

Here is what I came up with. My notes of what went right and wrong and some questions to address later are below.

DLinkedList.h

DLinkedList.c

main.c


What went right is that it seems to work. I'd like to add a list traversal function that prints out all the data in the node (node address, head pointer, tail pointer, and data) so that I can verify it's doing exactly what I think it's doing.

What went wrong is that I'm not sure if my AddWord() function is following the best practice for shuffling around variables. I'm always worried that I will treat pointers as special and not do the same thing I'd do if it was an integer. A new node being created in that function is always addressed "through" the node that preceded it. I should have created a temporary node pointer to hold the address of the newly created node and then assign the next/previous/data from that. The weirdness came from literally translating my hand-written notes to code.
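To make the "temporary node pointer" idea concrete, here is a small sketch of it in Python rather than C. It is not the AddWord() from DLinkedList.c, and the class and function names are made up. It also includes a traversal helper that reports each node's neighbors, so the links can be checked.

```python
class DNode:
    """A doubly linked node holding one word."""
    def __init__(self, data):
        self.data = data
        self.prev = None
        self.next = None


class DList:
    """Keeps track of both ends of the list."""
    def __init__(self):
        self.head = None
        self.tail = None


def add_word(lst, word):
    """Append a word, wiring everything up through a temporary reference."""
    new_node = DNode(word)        # hold the new node directly...
    new_node.prev = lst.tail      # ...and set its links through that name
    if lst.tail is None:
        lst.head = new_node       # first node in an empty list
    else:
        lst.tail.next = new_node  # hook the old last node to the new one
    lst.tail = new_node
    return new_node


def dump(lst):
    """Return (previous data, data, next data) for every node, head to tail."""
    rows = []
    node = lst.head
    while node is not None:
        prev_data = node.prev.data if node.prev else None
        next_data = node.next.data if node.next else None
        rows.append((prev_data, node.data, next_data))
        node = node.next
    return rows


words = DList()
for w in ["alpha", "beta", "gamma"]:
    add_word(words, w)
print(dump(words))
```

Working through new_node keeps every assignment about the new node in one place, instead of reaching for it via the node that precedes it.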

What's left is to do the verification, stress test it, and add in some safety checks for all the malloc() calls. Then I can move on to adding in the insert and delete node functions and think of something clever to do with all of it.

I made the nodes rather specifically hold pointers to strings in memory. The next iteration of this needs to make it hold anything. I think I can do this by (in the event of strings) allocating the space, doing strcpy(), and then casting the pointer to void to store in the node. Getting the data back out means needing to re-cast, probably. I messed around with going to and from void a while back and I don't recall having any difficulty.

The list struct which holds the head and tail of the list is there so that it's quick to find the last node. Otherwise I would have had to always keep the last node's "next" pointer NULL and walk the whole list to find it.

Right, so I think I can give this little side-project another run through in a few days and then I'll feel good about using fancier canned library versions of it!


string size

Here's a quick reminder for myself about getting the size of strings.