REALbasic University: Column 107
Debugging Part Two

Last week I talked about some common programming errors that are challenging to debug. Today I'll continue with that, but we'll also explore some techniques you can use to find these bugs.
The Debugging Process

Different programmers work in different ways. Some like to write a lot of code before running the program, while others like to run it frequently during development. Both approaches work fine and each has its adherents. REALbasic makes the latter approach particularly easy.

Interactive development can seem to take longer (it's more stop and go), but it has the advantage that when a bug appears, you've only written a small amount of code. Since logic dictates the error is probably within the code you just added, finding the bug is easier. (Keep in mind that occasionally bugs in old code aren't noticeable until you add new code.)

However, writing a lot of code without testing it involves more thinking and less trial and error. Sometimes that can be an advantage, as you'll find problems before you get halfway done coding them. For example, in the interactive approach you might dive in with an idea and spend an hour writing code only to find you've coded yourself into a corner and your approach just won't work. By thinking about it more, you can mentally run the program and find that problem before you've written a single line of code.

A good analogy for these two processes is how you solve a math problem. I'll use a simple example. Let's say we're supposed to solve for a in this simple algebra problem:

a / 10 = 5
Now there are two approaches. You can solve this with the approved step-by-step algebraic method you were taught in school (where you move the 10 to the other side of the equals sign and change the division to a multiplication), or you could use trial and error to "guess" at numbers and gradually pick one that works.

In the latter approach you could pick a small number, say 10, and put it in as a. Nope, 10 divided by 10 is 1, so that didn't work. Now try a larger number, like 100. That's too big: 100 / 10 is 10. But since that result is double the 5 we're looking for, let's halve it and try 50: hey, that works! 50 divided by 10 is five, therefore a = 50.

When I first started learning algebra in high school, I used the latter approach and my teacher didn't like it because even though I got the right answer, I wasn't doing it the "right" way. Of course, my way was only less work on simpler problems, but on those it made sense to me to use it as a shortcut.

I liken the trial-and-error method of math solving to the interactive programming style. Instead of thinking out the code in advance, you try stuff, and gradually whittle down to a workable solution. But just like the math situation, this only works well for simpler problems. You'll find with complex problems -- for example, a sort routine -- you can't even test anything until it's completely written. Then if you've made a mistake, it's a pain to go back and try to figure out where you went wrong.

I believe these kinds of frustrations result from not knowing both methods of coding. If you understand both the plan-ahead style and the trial-and-error style, you can use whichever is appropriate. The trick is being able to identify the problem you're solving and which approach is best. If you try the trial-and-error method with a plan-ahead problem, you're going to be frustrated.

Let's use a couple of practical examples with some code so I can illustrate what I mean.
For the first problem (Problem A), let's say you want a routine that will take some HTML text, strip out all the HTML tags, and return just the plain text. For our second problem (Problem B), we'll write a routine to sort an array of numbers.

Which approach will we take to each problem? We'll try both approaches!

Let's first try the trial-and-error approach to Problem A. We dive in, creating a simple function that accepts a string and returns a string:
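A minimal sketch of that starting point. The name stripHTML is an assumption (borrowed from the stripHTML2 routine that shows up in the debugger later on), and in the REALbasic IDE you'd normally set the name, parameter, and return type when you create the method rather than typing the first and last lines yourself.

```realbasic
// Hypothetical name; the column only mentions stripHTML2 later on.
Function stripHTML(theText As String) As String
  // Nothing here yet. We'll fill in the body as we go.
End Function
```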
Great. Now what? Well, we know we can't modify theText as it's a passed parameter, so we know we'll need another string variable to copy the text into and modify. So let's add that:
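As a sketch, assuming the same hypothetical stripHTML name and a working copy called t (the variable name the rest of this column uses):

```realbasic
Function stripHTML(theText As String) As String
  Dim t As String

  // Work on a copy so the passed-in parameter is left alone.
  t = theText
End Function
```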
Okay, next we know we'll be searching for text enclosed in "<" and ">" characters. So we can search for the first "<", find the ">" and delete everything in between. Something like this:
Note that since I'm writing this off the cuff, with no planning, I've just created two new variables, i and j, without declaring them. Obviously I'll need to do that for this to work. I fix that, but the program won't compile: then I notice my brain morphed two commands into one. My code for deleting the tagged text is totally wrong. It should be t = left(t, i - 1) + mid(t, j + 1). Let's try the new version:
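A sketch of what that version might look like, using the hypothetical stripHTML name. It assumes the text contains at least one complete tag, and it deliberately stops where the narrative does, so nothing is handed back to the caller yet.

```realbasic
Function stripHTML(theText As String) As String
  Dim t As String
  Dim i, j As Integer

  t = theText

  i = InStr(t, "<")   // position of the first "<" (0 if there is none)
  j = InStr(t, ">")   // position of the first ">" (0 if there is none)

  // Keep everything before the "<" plus everything after the ">".
  t = Left(t, i - 1) + Mid(t, j + 1)
End Function
```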
That returns nothing at all? What did I do wrong? Oh, what an idiot! I never return t at the end!
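With the missing Return added, the routine might look like this (again a sketch with the assumed stripHTML name, and still assuming at least one complete tag in the text):

```realbasic
Function stripHTML(theText As String) As String
  Dim t As String
  Dim i, j As Integer

  t = theText

  i = InStr(t, "<")   // position of the first "<"
  j = InStr(t, ">")   // position of the first ">"

  // Remove the first tag: keep the text on either side of it.
  t = Left(t, i - 1) + Mid(t, j + 1)

  Return t            // hand the modified copy back to the caller
End Function
```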
Hooray, it works! Of course it only deletes the first tag, but we can easily rewrite this as a while loop:
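Here is a sketch of that first attempt at a loop, reconstructed from the description that follows and using the assumed stripHTML name. Be warned that it reproduces the bug on purpose: i is set only once, before the loop, so the loop never ends on any text that contains a "<".

```realbasic
Function stripHTML(theText As String) As String
  Dim t As String
  Dim i, j As Integer

  t = theText
  i = InStr(t, "<")        // set once, before the loop

  While i > 0
    j = InStr(t, ">")      // first ">" anywhere in the text
    t = Left(t, i - 1) + Mid(t, j + 1)
    // BUG (deliberate, for illustration): i is never updated in here,
    // so "i > 0" stays true and the loop runs forever.
  Wend

  Return t
End Function
```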
Oh great. That didn't work. It produces an infinite loop. What did we do wrong?

Now we have to debug. Let's insert a break point and use REALbasic's debugger to see what's going on. Let's break right at the beginning so we can go through the program step by step.

Here we've added a break point (the red dot -- you add or remove it by clicking in the area to the left of the code) early on in the routine. The program will pause execution at this point and bring up the debugger so we can look at our variables and see what's going on in the program.

This is what the debugger looks like. We've got a list of our routines on the upper left (we're currently inside stripHTML2), a list of variables in the upper right, and our code in the lower pane. The green arrow shows us which line of code is being processed.

So far, everything appears to be fine. Our variable t is equal to theText, and i is 10, meaning that the first < symbol appears 10 characters into the string. Great, let's continue by pressing the green arrow at the top to advance the program forward a few lines (each click goes forward one line).

Again, this looks great. We can see that t is now different from theText: the first HTML tag has been removed successfully.

But when we go through the loop again, we see a problem. While the value of j changed, the value of i did not. It's still at 10 like it was the first time! Ah ha! There's our problem. We set i at the beginning, before our loop, but we didn't set it again at the end of the loop! A quick copy of that line of code and we've got this:
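A sketch of the loop with the search for "<" repeated at the bottom (assumed stripHTML name). Note that it still looks for ">" from the very start of the text, so it is not yet safe on every input.

```realbasic
Function stripHTML(theText As String) As String
  Dim t As String
  Dim i, j As Integer

  t = theText
  i = InStr(t, "<")

  While i > 0
    j = InStr(t, ">")      // still searching from the start of t
    t = Left(t, i - 1) + Mid(t, j + 1)
    i = InStr(t, "<")      // look for the next tag before looping again
  Wend

  Return t
End Function
```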
But oh no! It hangs again. We still have an infinite loop and must force quit our app. What is wrong now?

Stepping through the program, everything seems okay. We have to step through a long way, but suddenly, when i is around 2200, we notice something odd: j is less than i! Of course that's not good, because we're deleting the characters between i and j, and if j is before i, the tag won't be deleted. This means that somehow our file contains a > with no matching < sign.

Could that be the case? We can look at the text inside t and check. In the debugger, click on the button at the far right of the variable t and it will bring up a text window. Scrolling through it we stop when we get to the first unremoved HTML tag:
It looks strange, but it's easy to see the problem. This particular sample of HTML is from a previous RBU column and contains some code that includes a > sign. Obviously we have no control over bad HTML files, but we'd better make our routine handle them without crashing! The solution to this problem is simple: instead of searching from the beginning of the file for a > sign, start the search from i. Like this:
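In REALbasic, InStr accepts an optional starting position as its first argument, so the change is small. A sketch, with the assumed stripHTML name:

```realbasic
Function stripHTML(theText As String) As String
  Dim t As String
  Dim i, j As Integer

  t = theText
  i = InStr(t, "<")

  While i > 0
    // Start looking for ">" at position i, so a stray ">" that
    // appears earlier in the text can no longer be picked up.
    j = InStr(i, t, ">")
    t = Left(t, i - 1) + Mid(t, j + 1)
    i = InStr(t, "<")
  Wend

  Return t
End Function
```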
Hooray! Now it works fine. But just in case there's some other odd tag situation, let's add an emergency exit:
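One plausible form of such an exit, sketched under the assumption that the danger is a "<" with no ">" after it. In that case InStr returns 0, and we leave the loop rather than spin forever. The stripHTML name and the exact test are assumptions; a loop counter with an upper limit would be another way to do it.

```realbasic
Function stripHTML(theText As String) As String
  Dim t As String
  Dim i, j As Integer

  t = theText
  i = InStr(t, "<")

  While i > 0
    j = InStr(i, t, ">")
    If j = 0 Then
      // Emergency exit: a "<" with no closing ">" after it.
      // Give up and return what we have so far.
      Exit
    End If
    t = Left(t, i - 1) + Mid(t, j + 1)
    i = InStr(t, "<")
  Wend

  Return t
End Function
```

With this check in place, an unterminated tag is simply left in the returned text instead of hanging the program.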
Okay, that at least works, but our algorithm isn't ideal. We've got some problems which we'll tackle in the next lesson.

If you would like the complete REALbasic project file for this week's tutorial (including resources), you may download it here.

Next Week

We polish our StripHTML routine and fix the flaws.

Letters

Today we've got a note from Charles Szasz, who wonders about creating bar graphs within REALbasic:
Hi Charles! Unfortunately I don't have any code written for doing bar graphs. Depending on what you need (i.e. how fancy you want to make the graphs and how flexible you make a bar graph class) it doesn't sound too difficult. After all, the basic formula is simple enough: you figure out the percentage you need to draw, pass that to a canvas subclass, and then it draws a bar that percentage of its height.

If you'd like a commercial solution which would give you the most flexibility and power, check out Graph Pro for REALbasic. It looks like it will graph just about anything. It's written by the esteemed Dr. Gerard Hammond from Australia (whom I had the honor of meeting at this summer's Apple Worldwide Developers Conference). It costs $75, which may be a lot or not, depending on your needs and how much your own time is worth. There is a demo so you can try it out and at least see if it will work for you.

About the Column

REALbasic University is a weekly instructional column on programming with REALbasic and is brought to you by REALbasic Developer, the magazine for REALbasic programmers. Each week we answer select reader questions, and we're always open to ideas for future columns. Send your questions to . (Keep your questions simple and specific. General queries like "How do I write my own web browser?" will be neglected.) Your question won't be answered immediately, but will be answered in a future column. (If you don't want your correspondence published, just be sure to indicate that when you write. Otherwise it's fair game.)
REALbasic University contents ©2001-2004 by Marc Zeedar and REALbasic Developer. All Rights Reserved.