REALbasic University: Column 051

SpamBlocker: Part II

Last week we started SpamBlocker, a little program to obscure HTML email links within web pages. We started off building our interface and setting the program up to handle text files dropped onto the main window. Now we'll finish up with the code that searches the text and converts the emails to gibberish.

The ParseFile Method

This is the core of the program. This is where we search through the file and convert any "mailto" URLs we find.

The steps we follow are these:

  1. Load in the text file
  2. Find and replace all the "mailto" references
  3. Save out the new file

Create a new method (Edit menu, "New Method") and name it parseFile. For parameters, put in f as folderItem. Click okay.

Here's the code:

  
dim in as textInputStream
dim out as textOutputStream
dim s, searchString, replaceString as string
dim i, j as integer

searchString = "<a href=" + chr(34) + "mailto:"
in = f.openAsTextFile
if in <> nil then
  s = in.readAll
  in.close

  i = inStr(s, searchString)
  while i > 0
    replaceString = ""
    i = i + len(searchString)
    j = inStr(i, s, chr(34))
    if j > 0 then
      // Found one, so convert it
      replaceString = "mailto:" + mid(s, i, j - i)
      replaceString = convertString(replaceString)

      // Insert it
      s = mid(s, 1, i - len("mailto:") - 1) + replaceString + mid(s, j)
    end if

    i = inStr(i, s, searchString)
  wend

  out = f.createTextFile
  if out <> nil then
    out.writeLine s
    out.close
  end if
end if // in <> nil

Whoa. It looks like a lot of stuff. But it's not so bad when we break it down.

Initially we define the variables we'll be working with. We then set searchString to the actual text we'll be searching for: <a href="mailto:. Note that chr(34) is a double-quote mark, like ".

Next, we attempt to open the file as a text file. The variable in is a textInputStream object. If for some reason the file fails to open, in will be set to nil, so we skip the file. If it's not nil, we process it.

First we read in the entire contents of the file and put the text into s. Then we close the in object (we're done with it). (Technically, the file will eventually be closed automatically by REALbasic, but it's still a good habit to do it yourself.)

Now we start a loop. This one searches until it can't find searchString inside s (the file). We tell it this by assigning the result of the inStr() function to i. inStr() returns 0 if it can't find what you're searching for; otherwise it returns the character number where the found string starts.

In Detail

If you're not that familiar with inStr(), put this code in a button and try it:

  
// Returns 2
msgBox str(inStr("Mischief is my cat.", "is"))

// Returns 9
msgBox str(inStr("Mischief is my cat.", " is"))

// Returns 1
msgBox str(inStr("Mischief is my cat.", "Mischief"))

// Returns 9
msgBox str(inStr("Mischief is my cat.", " "))

// Returns 0
msgBox str(inStr("Mischief is my cat.", "q"))

Assuming our search string was found, i is the starting number. We next want to find the end of the "mailto" URL. Since the HREF element is enclosed in double-quotes, we search for a double-quote (chr(34)). But we don't want to start the search at i -- since it occurs first, the search would stop at the " within searchString!

So we add the length of searchString to i, so we start searching at the end of searchString. We put the result of this search into j. The range i to j now represents the email address. The difference between the two variables is the length of the email.

We can use that info to extract the email with the formula mid(s, i, j - i). This grabs j - i characters from s, starting at position i.

In Detail

This may not be clear, so here's a diagram.

Assume the text is our file. At first i would be equal to 19, then 35 when we add the length of searchString to it. Since the end quote mark is at 51, and 51 - 35 = 16, if we grab the 16 characters starting at 35, we've got the text highlighted in yellow!
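If you'd like to check those numbers yourself, here's a little sketch you can put in a button. The sample text is made up for illustration (it isn't necessarily the text from the diagram), but it's arranged so the positions come out the same: 18 characters of lead-in, then the search string, then a 16-character email address.

```realbasic
dim s, searchString as string
dim i, j as integer

searchString = "<a href=" + chr(34) + "mailto:"

// Made-up sample text: 18 characters, then the link
s = "Send me a letter: " + searchString + "someone@test.com" + chr(34) + ">Email</a>"

// Shows 19 (where searchString starts)
i = inStr(s, searchString)
msgBox str(i)

// Shows 35 (the first character of the email)
i = i + len(searchString)
msgBox str(i)

// Shows 51 (the closing quote mark)
j = inStr(i, s, chr(34))
msgBox str(j)

// Shows someone@test.com
msgBox mid(s, i, j - i)
```

Notice that the second search starts at 35, so it skips right past the quote mark inside searchString (at position 27) and lands on the closing one.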

Once we've got the text we need to convert, we pass that to convertString, a method we'll write in a minute that actually encodes the email.

The new string is inserted into the file:

 s = mid(s, 1, i - len("mailto:") - 1) + replaceString + mid(s, j) 

The first mid() function grabs the first part of the file (everything up to the find point, i). By subtracting the length of "mailto:" we get rid of the "mailto:" in the original file and use the one we encoded with convertString. We add in our replaceString (which has been encoded) and then tack on the remaining text in the file. (Remember, if you don't tell mid() to return a certain number of characters, it returns everything to the end of the string.)

The result of all this is a new s that contains everything it did before, except the email address itself has been encoded.
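To see the splice by itself, here's a sketch for a button that uses a made-up sample string and a stand-in for the encoded text ("[ENCODED]" is just a placeholder, not what convertString returns). It only shows which pieces of s the two mid() calls keep.

```realbasic
dim s, searchString, replaceString as string
dim i, j as integer

searchString = "<a href=" + chr(34) + "mailto:"

// Made-up sample text for illustration
s = "Send me a letter: " + searchString + "someone@test.com" + chr(34) + ">Email</a>"

i = inStr(s, searchString) + len(searchString)
j = inStr(i, s, chr(34))

// Placeholder standing in for the encoded "mailto:" text
replaceString = "[ENCODED]"

s = mid(s, 1, i - len("mailto:") - 1) + replaceString + mid(s, j)

// Shows Send me a letter: <a href="[ENCODED]">Email</a>
msgBox s
```

The first mid() keeps everything through the opening quote mark, and the second keeps everything from the closing quote mark onward, so both quotes survive and only the text between them changes.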

We then start the search over again, looking for another "mailto" line.

Our final step is to write out the new file. We set out to a textOutputStream object, and we only work with it if it doesn't equal nil. If it's a valid textOutputStream, we write all of the new s to the file, erasing the old file in the process.

Important Note: for this particular program, I chose to overwrite the original file. This is not smart. When you're testing SpamBlocker, make sure you try it on test or backup files, not irreplaceable files, until you're positive it's working correctly. Once you've overwritten a file, there's no way to get back the original!
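If overwriting makes you nervous, one alternative is to write the result to a second file next to the original. This is only a sketch of how the end of parseFile might change, not part of SpamBlocker as written: the newFile variable and the ".new" suffix are made up for illustration, and f and s are assumed to be the same variables used in parseFile.

```realbasic
// Sketch only: replaces the "out = f.createTextFile" part of parseFile.
// Assumes f (the dropped file) and s (the converted text) already exist.
dim out as textOutputStream
dim newFile as folderItem

// Hypothetical naming scheme: same folder, original name plus ".new"
newFile = f.parent.child(f.name + ".new")
if newFile <> nil then
  out = newFile.createTextFile
  if out <> nil then
    out.writeLine s
    out.close
  end if
end if
```

Keep in mind that a long file name plus the suffix could exceed the file system's name limit, in which case the new file won't be created.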

The ConvertString Method

We're almost done: we just need to write the method that actually encodes the email address. This is fairly simple.

First, create a new method (Edit menu, "New Method") and name it convertString. For parameters, put in theString as string, and set the return type to string.

Now add this code:

  
dim i, n as integer
dim s as string

s = ""
n = len(theString)
for i = 1 to n
  s = s + "&#" + str(asc(mid(theString, i, 1))) + ";"
next

return s

Ooh, complicated, isn't it! ;-)

As you can tell, we pass in the string we're wanting to convert as theString. We obtain the length of it and set up a for-next loop from 1 to that length. Then we examine each letter in the string.

For each letter, we add "&#" to s, followed by the ASCII number of the letter and a semi-colon. The mid() returns the individual letter. Asc() returns the ASCII number of that letter. Str() converts that number to a string so we can add it to s.

Once we've processed all the letters in theString, we return s, which contains the same letters encoded as HTML entities. Simple!
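To see what that produces, you could call convertString from a button with a short string. This assumes you've added the convertString method exactly as shown above.

```realbasic
// Shows &#109;&#97;&#105;&#108;&#116;&#111;&#58;
msgBox convertString("mailto:")

// Shows &#97;&#64;&#98;
msgBox convertString("a@b")
```

The letter m is character 109, a is 97, and so on. A web browser turns each entity back into its letter, which is why the link keeps working.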

That should pretty much do it. Go ahead and save and run the program. I've created a testfile.html which you can use to test SpamBlocker. Here's what it looks like after conversion:

It looks weird, but the email links still work fine! How effective this is at stopping Spam is another question, but it can't hurt.

Extending SpamBlocker

I've just been running SpamBlocker from within REALbasic, but if you wish, you could compile it to disk as a stand-alone application. If you did that, you'd probably want to make it support dropping files directly onto the SpamBlocker icon (instead of just the SpamBlocker window).

I leave that as a task for you, but I will give you a few hints:

  • You'll need to add an application class object to SpamBlocker.
  • You'll put the code that handles dropped files within the app's OpenDocument event.
  • You'll need to check the "icon" checkbox within the file type dialog (if you don't, your compiled app won't accept dropped files).

Another improvement you could make would be to have SpamBlocker support handling dropped folders (right now it only accepts text files). To get that to work you'd have to add a new folder file type (use the popup menu and choose the "special/folder" item), tell the window to accept folder drops, and then recursively parse the contents of the folder(s) dropped.

So those enhancements are your homework assignments, if you're so inclined.
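As a starting point for the folder homework, here's a rough sketch of the recursive part. It isn't part of the SpamBlocker project: parseFolder is a hypothetical method you'd create yourself (with f as folderItem for parameters), and checking the file name for ".html" or ".htm" is an assumption made here so that non-HTML files are left alone.

```realbasic
// Sketch of a hypothetical method: parseFolder(f as folderItem)
dim i as integer
dim child as folderItem

for i = 1 to f.count
  child = f.item(i)
  if child <> nil then
    if child.directory then
      // A folder inside the folder, so dig into it
      parseFolder child
    elseif right(child.name, 5) = ".html" or right(child.name, 4) = ".htm" then
      // Assumed filter: only convert files that look like web pages
      parseFile child
    end if
  end if
next
```

Since parseFile overwrites whatever it's handed, a filter of some kind matters even more here than with single files. You'd still need to add the folder file type and tell the window to accept folder drops, as described above.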

If you would like the complete REALbasic project file for this week's tutorial (including resources), you may download it here.

Next Week

Something cool and exciting, of course!

Letters

Our first letter this week concerns last week's column. REALbasic guru Thomas Tempelmann alerted me to a technicality regarding my definition of ASCII.

The ASCII code only defines the "simple" U.S. characters, with codes from 0 to 127, nothing more.

So, you can not call the #169 code an ASCII code, because there is no such ASCII table defining that code. I believe the best term would be to call it the ANSI 8 bit char set, even though both ANSI and ISO are standards committees that have numbers for the individual standards. So, the correct ref would be something like ANSI 8997 or so. Just that I don't remember the number for ANSI and ISO.

Thomas

Thanks for the clarification, Thomas. My explanation of ASCII in the RBU glossary is more accurate, but I tend to lump all character numbers under ASCII, and that's technically inaccurate.

Next, we've got a question from a medical doctor:

Hi-

I am an old-time programmer (FORTRAN, Pascal, BASIC era...) who is learning a few new tricks. I'm delving in to REALbasic to implement a few projects to make my medical office work more efficiently. Naturally, if I can make these work well enough and make them usable enough, I hope to market them to other physicians too.

My problem in a nutshell is this. I want to implement a database of drug names in a hierarchical fashion to ease selection of a particular drug for prescription writing, record-keeping, etc. The hierarchy looks like this:

  
DrugClass0
  DrugSubClass0
    DrugName0
    DrugName1
    ...
    DrugNameN
  DrugSubClass1
    DrugName0
    DrugName1
    ...
DrugClass1
  DrugSubClass0

I'm looking for a database structure and RB code to implement something like the "column view" in OS X for selection of a drug. Do you know if there is anything out in the RB universe that does this?

In the same vein (that's a medical joke, BTW), I'd like to allow a user to start typing a medication name and to have the software automatically complete (or suggest in dimmed characters a completion) of the name with a popup list of possibilities based on the first letters typed. You obviously know what I'm aiming at. I know that this is a different question than my first one. I can visualize a possible solution using a database implementation and a SQL select statement using wild cards for completion of the entry. Do you know if there is code existing that I might find and use so that I don't have to reinvent the wheel?

Thank you very much for your kind assistance!!

Harlan R. Ribnik, M.D.

Unfortunately, I haven't seen anything like the "column view" structure you're looking for. It sounds like a great thing, and it seems like someone should have created a class that does this, but I'm not aware of it. That doesn't mean it doesn't exist: if anyone out there knows of a REALbasic class that implements a Mac OS X column view, let me know and I'll publish something about it here.

Meantime, you could simply use a dynamic listbox that changes with what the user types (similar to Claris Emailer's address book display of names that match the letters you type).

I can, however, help you with the second part of your question. My own Grayedit class gives you a way to add autocompletion to your own projects. You feed it a list of names and when the user starts to type in a name, the completed name is displayed in gray, like this:

Pressing the tab key "fills in" the full name in black.

Grayedit is a free class, so download the sample project and try it out. It does require an array of names, so you'd have to create a method to dynamically generate that list from your SQL database (which doesn't sound difficult).


About the Column
REALbasic University is a weekly instructional column on programming with REALbasic and is brought to you by REALbasic Developer, the magazine for REALbasic programmers.

Each week we answer select reader questions, and we're always open to ideas for future columns. Send your questions to . (Keep your questions simple and specific. General queries like "How do I write my own web browser?" will be neglected.) Your question won't be answered immediately, but will be answered in a future column. (If you don't want your correspondence published, just be sure to indicate that when you write. Otherwise it's fair game.)

About the Author
Marc Zeedar is an author, philosopher, graphic designer, photographer, film director, soccer fanatic, and programmer (among other things). He writes for MacOpinion, runs his own software company, Stone Table Software, which sells the revolutionary Z-Write word processor, and is Publisher and Editor of REALbasic Developer. He lives in Northern California with his cats, Mischief and Mayhem, and is rapidly running out of free time.

See the REALbasic University Archives


REALbasic University contents ©2001-2004 by Marc Zeedar and REALbasic Developer. All Rights Reserved.
