REALbasic University: Column 051
SpamBlocker: Part II
Last week we started SpamBlocker, a little program to obscure HTML email links within web pages. We started off building our interface and setting the program up to handle text files dropped onto the main window. Now we'll finish up with the code that searches the text and converts the emails to gibberish.
The ParseFile Method
This is the core of the program. This is where we search through the file and convert any "mailto" URLs we find. The steps we follow are these:
Create a new method (Edit menu, "New Method") and name it parseFile. For parameters, put in f as folderItem. Click okay. Here's the code:
Whoa. It looks like a lot of stuff, but it's not so bad when we break it down. Initially we define the variables we'll be working with. We then set searchString to the actual text we'll be searching for: <a href="mailto:. Note that chr(34) is a double-quote mark (").
Next, we attempt to open the file as a text file. The variable in is a textInputStream object. If for some reason the file fails to open, in will be set to nil, so we skip the file. If it's not nil, we process it. First we read in the entire contents of the file and put the text into s. Then we close the in object, since we're done with it. (Technically, the file will eventually be closed automatically by REALbasic, but it's still a good habit to do it yourself.)
Now we start a loop. This one keeps searching until it can't find searchString inside s (the file). We control it by assigning the result of the inStr() function to i. inStr() returns 0 if it can't find what you're searching for; otherwise it returns the character number where the found string starts.
Assuming our search string was found, i is the character number where it starts. We next want to find the end of the "mailto" URL. Since the HREF element is enclosed in double-quotes, we search for a double-quote (chr(34)). But we don't want to start the search at i: since it occurs first, the search would stop at the " within searchString! So we add the length of searchString to i, which moves i to the first character after searchString (the start of the email address), and we begin the search there. We put the result of this search into j. The range from i up to (but not including) j now represents the email address, and the difference between the two variables is its length. We can use that info to extract the email with the formula mid(s, i, j - i), which grabs j - i characters starting at i.
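To make the index arithmetic concrete, here is a small worked example. The sample link and the variable name address are made up for illustration; the character numbers in the comments follow from REALbasic's 1-based inStr() and mid().

```realbasic
// Worked example of the i/j arithmetic, using a made-up sample line.
dim s, searchString, address as string
dim i, j as integer

searchString = "<a href=" + chr(34) + "mailto:"   // 16 characters long
s = searchString + "me@x.com" + chr(34) + ">Mail</a>"

i = inStr(s, searchString)     // 1: the link starts the string
i = i + len(searchString)      // 17: first character of the address
j = inStr(i, s, chr(34))       // 25: the closing double-quote
address = mid(s, i, j - i)     // "me@x.com" (j - i = 8 characters)
```

Had the second search started at the original i, it would have stopped at character 9, the double-quote inside searchString itself.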
Once we've got the text we need to convert, we pass that to convertString, a method we'll write in a minute that actually encodes the email. The new string is then spliced into s, the text of the file:
s = mid(s, 1, i - len("mailto:") - 1) + replaceString + mid(s, j)
The first mid() function grabs the first part of the file (everything up to the find point, i). By subtracting the length of "mailto:" we get rid of the "mailto:" in the original file and use the one we encoded with convertString. We add in our replaceString (which has been encoded) and then tack on the remaining text in the file. (Remember, if you don't tell mid() to return a certain number of characters, it returns everything to the end of the string.)
The result of all this is a new s that contains everything it did before, except the email address itself has been encoded. We then start the search over again, looking for another "mailto" line.
Our final step is to write out the new file. We set out to a textOutputStream object, and we only work with it if it doesn't equal nil. If it's a valid textOutputStream, we write all of the new s to the file, erasing the old file in the process.
Important Note: for this particular program, I chose to overwrite the original file. This is not smart. When you're testing SpamBlocker, make sure you try it on test or backup files, not irreplaceable files, until you're positive it's working correctly. Once you've overwritten a file, there's no way to get back the original!
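To tie the walkthrough together, here's a minimal sketch of how parseFile could look, pieced together from the description above rather than copied from the project file. A few things in it are my assumptions: the names inStream and outStream stand in for the in and out variables discussed in the text, the check for a missing closing quote (j = 0) is an extra safeguard, and the file is written with createTextFile. In the IDE you'd enter the method name and parameter in the New Method dialog and type only the body; the Sub and End Sub lines are shown for context.

```realbasic
Sub parseFile(f as folderItem)
  // Sketch only. inStream and outStream play the roles of "in" and "out".
  dim inStream as textInputStream
  dim outStream as textOutputStream
  dim s, searchString, replaceString as string
  dim i, j as integer

  searchString = "<a href=" + chr(34) + "mailto:"

  inStream = f.openAsTextFile
  if inStream = nil then
    return                          // couldn't open the file, so skip it
  end if
  s = inStream.readAll
  inStream.close

  i = inStr(s, searchString)
  while i > 0
    i = i + len(searchString)       // first character of the address
    j = inStr(i, s, chr(34))        // closing double-quote of the HREF
    if j > 0 then
      // Encode "mailto:" together with the address, then splice it in.
      replaceString = convertString("mailto:" + mid(s, i, j - i))
      s = mid(s, 1, i - len("mailto:") - 1) + replaceString + mid(s, j)
      i = inStr(s, searchString)    // look for the next link
    else
      i = 0                         // assumption: no closing quote, give up
    end if
  wend

  outStream = f.createTextFile      // this overwrites the original file!
  if outStream <> nil then
    outStream.write s
    outStream.close
  end if
End Sub
```

Because the encoded text no longer contains a literal "mailto:", each search from the top of s finds only links that haven't been converted yet, so the loop ends once every link has been processed.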
The ConvertString Method
We're almost done: we just need to write the method that actually encodes the email address. This is fairly simple. First, create a new method (Edit menu, "New Method") named convertString, with theString as string for its parameter and string as its return type.
Now add this code:
Ooh, complicated, isn't it! ;-) As you can tell, we pass in the string we want to convert as theString. We obtain its length and set up a for-next loop from 1 to that length. Then we examine each letter in the string. For each letter, we add "&#", then the ASCII number of the letter, then a semi-colon. The mid() returns the individual letter, asc() returns the ASCII number of that letter, and str() converts that number to a string so we can add it to s. Once we've processed all the letters in theString, we return s, which contains the same letters encoded as HTML entities. Simple!
That should pretty much do it. Go ahead and save and run the program. I've created a testfile.html which you can use to test SpamBlocker. Here's what it looks like after conversion: [screenshot: testfile.html after conversion]
It looks weird, but the email links still work fine! How effective this is at stopping spam is another question, but it can't hurt.
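Going back to convertString for a moment, here's a minimal sketch of the loop as described above. It's written from the description, so treat the exact layout (and the extra variable n) as my assumption, not the project's listing.

```realbasic
Function convertString(theString as string) As string
  // Sketch only: turns each character into an HTML numeric entity.
  dim s as string
  dim i, n as integer

  n = len(theString)
  for i = 1 to n
    // For the letter "m" this adds "&#109;"
    s = s + "&#" + str(asc(mid(theString, i, 1))) + ";"
  next

  return s
End Function
```

For example, convertString("me") would return "&#109;&#101;", since "m" is character 109 and "e" is character 101. A browser turns those entities back into the original letters, which is why the links keep working.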
Extending SpamBlocker
I've just been running SpamBlocker from within REALbasic, but if you wish, you could compile it to disk as a stand-alone application. If you did that, you'd probably want to make it support dropping files directly onto the SpamBlocker icon (instead of just the SpamBlocker window). I leave that as a task for you, but I will give you a few hints:
Another improvement you could make would be to have SpamBlocker support handling dropped folders (right now it only accepts text files). To get that to work you'd have to add a new folder file type (use the popup menu and choose the "special/folder" item), tell the window to accept folder drops, and then recursively parse the contents of the folder(s) dropped. So those enhancements are your homework assignments, if you're so inclined. If you would like the complete REALbasic project file for this week's tutorial (including resources), you may download it here.
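If you'd like a head start on the dropped-folder homework, here's one way the recursive step might be sketched. The method parseFolder is hypothetical (it isn't part of the project), and the file-name check is my assumption, added so that only .html and .htm files get passed to parseFile and overwritten.

```realbasic
Sub parseFolder(f as folderItem)
  // Hypothetical helper: walks a dropped item and parses the HTML files in it.
  dim i as integer

  if f = nil then
    return                        // nothing usable (e.g. an unreadable item)
  end if

  if f.directory then
    for i = 1 to f.count          // folderItem.item() counts from 1
      parseFolder f.item(i)       // recurse into each file or subfolder
    next
  else
    // Assumption: only touch files whose names look like web pages.
    if right(f.name, 5) = ".html" or right(f.name, 4) = ".htm" then
      parseFile f
    end if
  end if
End Sub
```

You would call parseFolder on each dropped item instead of calling parseFile directly. The same warning applies as before: parseFile overwrites files, so try this on a copy of a folder first.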
Next Week
Something cool and exciting, of course!
Letters
Our first letter this week concerns last week's column. REALbasic guru Thomas Tempelmann alerted me to a technicality regarding my definition of ASCII.
Thanks for the clarification, Thomas. My explanation of ASCII in the RBU glossary is more accurate, but I tend to lump all character numbers under ASCII, and that's technically inaccurate. Next, we've got a question from a medical doctor:
Unfortunately, I haven't seen anything like the "column view" structure you're looking for. It sounds like a great thing, and it seems like someone should have created a class that does this, but I'm not aware of it. That doesn't mean it doesn't exist: if anyone out there knows of a REALbasic class that implements a Mac OS X column view, let me know and I'll publish something about it here. Meantime, you could simply use a dynamic listbox that changes with what the user types (similar to Claris Emailer's address book display of names that match the letters you type).
I can, however, help you with the second part of your question. My own Grayedit class gives you a way to add autocompletion to your own projects. You feed it a list of names and when the user starts to type in a name, the completed name is displayed in gray, like this: [screenshot: a name autocompleted in gray]
Pressing the tab key "fills in" the full name in black. Grayedit is a free class, so download the sample project and try it out. It does require an array of names, so you'd have to create a method to dynamically generate that list from your SQL database (which doesn't sound difficult).
About the Column
REALbasic University is a weekly instructional column on programming with REALbasic and is brought to you by REALbasic Developer, the magazine for REALbasic programmers. Each week we answer select reader questions, and we're always open to ideas for future columns. Send your questions to . (Keep your questions simple and specific. General queries like "How do I write my own web browser?" will be neglected.) Your question won't be answered immediately, but will be answered in a future column. (If you don't want your correspondence published, just be sure to indicate that when you write. Otherwise it's fair game.)
About the Author
See the REALbasic University Archives
REALbasic University contents ©2001-2004 by Marc Zeedar and REALbasic Developer. All Rights Reserved.