Author Topic: Word Discrimination  (Read 6038 times)

Padriag

  • Newbie
  • *
  • Posts: 40
    • View Profile
    • http://www.bardicheart.com
Word Discrimination
« on: February 22, 2004, 08:36:45 pm »
Okay, I'm very carefully skirting the edges of tinkering with Hal's brain.  Mostly I'm just learning how it works right now, what it can and can't do.

Here's my question: how good is Hal at discriminating between similar words and contexts of use, and does it recognize a hyphenated word as being a different word than one of the root words?  In the latter case, if it cannot recognize the difference, how does it handle the hyphenated word (two words combined by a hyphen) for things like topic recognition?
 

vonsmith

  • Hero Member
  • *****
  • Posts: 602
    • View Profile
« Reply #1 on: February 22, 2004, 09:10:37 pm »
Padriag,
I haven't looked closely at hyphens in particular. Hal removes some punctuation when preprocessing the user's sentence. Hal also adds a space between words and punctuation so that the words can be detected more easily. When commands are given to Hal he is very literal about words being spelled correctly. I theorized a while back that his operation would be better if I added some script to correct some common spelling errors. That script also changes British spelling into American spelling, e.g. colour vs. color. That script has been posted here before and is included in my XTF brain script that is on the forum's file download page. I think this improved his capability a little.
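
The spelling-correction idea can be modelled in a few lines. This is only a sketch in Python, not the XTF script itself (which is VBScript and is not shown in this thread); the table name SPELLING_FIXES and its entries are made up for illustration.

```python
import re

# Hypothetical correction table; the real XTF word list is not shown in the thread.
SPELLING_FIXES = {
    "colour": "color",
    "favourite": "favorite",
    "teh": "the",
}

def normalize_spelling(sentence: str) -> str:
    """Replace whole words found in SPELLING_FIXES and leave everything else alone."""
    def fix(match):
        word = match.group(0)
        # Lookup is case-insensitive; a replaced word comes back in lower case.
        return SPELLING_FIXES.get(word.lower(), word)
    # [A-Za-z]+ matches a complete run of letters, so "colourful" is looked up
    # as one word and is not partially rewritten.
    return re.sub(r"[A-Za-z]+", fix, sentence)

print(normalize_spelling("my favourite colour"))
```

Matching whole words rather than substrings is the point of the design: a plain substring replace of "colour" would also rewrite the middle of longer words.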

For creating a response to a user's sentence Hal is less fussy about spelling. Hal uses a heuristic function that finds the best keyword match in his QA (Question/Answer) type files. He will extract keywords from the user's current sentence and find the best match to keyword phrases saved from prior conversations in the appropriate QA type .brn file. If a sufficiently good match is found then Hal returns the sentence associated with the keyword phrase to the user as a response.
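
As a rough picture of that kind of keyword matching, here is a Python sketch. It is not Hal's actual heuristic: the stop-word list, the overlap score (Jaccard similarity) and the threshold are all assumptions chosen for illustration, and the stored pairs stand in for whatever a QA .brn file really contains.

```python
# Hypothetical stop-word list; Hal's own keyword extraction is not shown here.
STOP_WORDS = {"a", "an", "the", "is", "are", "do", "you", "your", "i", "what", "of", "to"}

def keywords(sentence):
    """Lower-case the sentence, drop punctuation, and remove stop words."""
    cleaned = "".join(c if c.isalnum() else " " for c in sentence.lower())
    return {w for w in cleaned.split() if w not in STOP_WORDS}

def best_response(user_sentence, qa_pairs, threshold=0.5):
    """Return the stored answer whose keyword phrase best overlaps the user's
    keywords, or None when no phrase scores at least `threshold`."""
    user_keys = keywords(user_sentence)
    best_score, best_answer = 0.0, None
    for phrase, answer in qa_pairs:
        stored = keywords(phrase)
        if not user_keys or not stored:
            continue
        # Jaccard similarity: shared keywords divided by all distinct keywords.
        score = len(user_keys & stored) / len(user_keys | stored)
        if score > best_score:
            best_score, best_answer = score, answer
    return best_answer if best_score >= threshold else None

qa = [
    ("favorite color blue", "I like blue too."),
    ("weather rain today", "I hope it clears up."),
]
print(best_response("What is your favorite color?", qa))
```

The threshold plays the role of the "sufficiently good match" test: below it, the function gives up rather than returning a poor answer.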


=vonsmith=
 

Padriag

« Reply #2 on: February 22, 2004, 09:13:47 pm »
Hmm... interesting.  BTW, I've already installed your XTF brain on my Cleo (I got the Cleopatra character) and set up a separate copy of the brain, renamed just for her so that I can keep what she learns (as I experiment) separate from other brains and personalities.

Is there some part of the script that I should look at that deals with punctuation and hyphens?  Maybe I can figure this one out myself.
 

vonsmith

« Reply #3 on: February 22, 2004, 09:41:20 pm »
Padriag,
Look at the top section of the GetResponse function in Hal's .uhp file. You'll see a lot of preprocessing of the variable UserSentence. The Halbrain functions do some of the work and some script in GetResponse does the rest. The best way to see what happens in the GetResponse function is to add some file writing script using the appendfile function you see in the GetResponse script. You can write variables to a test file at different points to see how the variables, UserSentence in particular, change with each step.
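
The trace-file technique looks roughly like this. It is a Python stand-in, not Hal script: trace() plays the part of the append-to-file call, and the two preprocessing steps are placeholders rather than the real contents of GetResponse.

```python
def trace(path, label, value):
    """Append one labelled snapshot of a variable to the trace file."""
    with open(path, "a", encoding="utf-8") as f:
        # Brackets make leading and trailing spaces visible in the log.
        f.write(f"{label}: [{value}]\n")

def preprocess(user_sentence, log_path):
    """Placeholder preprocessing that logs the sentence after every step."""
    trace(log_path, "raw", user_sentence)
    user_sentence = user_sentence.strip()
    trace(log_path, "trimmed", user_sentence)
    user_sentence = user_sentence.replace("-", " VHZ ")
    trace(log_path, "hyphens marked", user_sentence)
    return user_sentence
```

Calling preprocess("  gun-shy ", "trace.txt") leaves three lines in trace.txt, one per step, so you can see exactly where the sentence changes.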


=vonsmith=
 

Padriag

« Reply #4 on: February 22, 2004, 09:42:01 pm »
Okay, so maybe I am answering my own questions today.  I did some experimenting and this is what I learned.

Apparently Hal / XTF does not recognize hyphenated words, at least not reliably.

Looking through the .brn files, if I am understanding what I am seeing, then for a hyphenated word to be accepted it would have to be added to perhaps the substitution or correction .brn files.  Does this sound right or am I off the mark?
 

Padriag

« Reply #5 on: February 22, 2004, 10:04:15 pm »
Okay, I see what you're pointing me at and I think I understand what is happening a little better now.  Hal removes hyphens to help it better understand the sentence.  When this happens the hyphenated word is separated into two separate words.  Example: gun-shy becomes GUN SHY to Hal, two separate words.  So no, it apparently cannot understand hyphenated words or names.  Words have to be strictly alphanumeric for Hal to recognize them and understand them.  Here's where I'm confused.  In the portion of the script that removes the hyphen it says

UserSentence = Replace("" & UserSentence & "", "-", " VHZ ", 1, -1, vbTextCompare)

Is it changing - into VHZ?

If so then could we not go to XTF_SYS_Substitutions.brn and add

" GUNVHZSHY ","GUN-SHY"

and get a possible workaround for this problem?
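
One way to check the idea is to model the Replace line outside Hal. The Python below mimics only that single line; the upper-casing and the padding with spaces are assumptions about the rest of the preprocessing, which is not quoted in this thread.

```python
def mark_hyphens(user_sentence):
    """Model of the quoted Replace(...) call: every "-" becomes " VHZ "."""
    return user_sentence.replace("-", " VHZ ")

marked = mark_hyphens("I am gun-shy")
# Assumption: the sentence is upper-cased and padded with spaces before lookups.
padded = " " + marked.upper() + " "

print(marked)                      # I am gun VHZ shy
print("GUNVHZSHY" in padded)       # False
print(" GUN VHZ SHY " in padded)   # True
```

Because the replacement text is " VHZ " with a space on each side, the marked sentence never contains GUNVHZSHY as one run. If the substitution lookup works on the marked text, the key would presumably need to be " GUN VHZ SHY " instead.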
 

vonsmith

« Reply #6 on: February 23, 2004, 01:09:14 am »
Padriag,
Right on, you found where Hal substitutes VHZ for a hyphen plus a few other things. If you look about 50 lines down you will see where the substitutions.brn file (or XTF_SYS_Substitutions.brn in the XTF brain) is used to substitute the hyphen for the VHZ in the UserSentence. The function heading says, "PROCESS: WORD AND PHRASE SUBSTITUTION". So punctuation is placed back in the sentence after the sentence is finished being processed. The punctuation should, in all cases, have a space before and after it to separate it from words. Hal needs the separation by spaces to detect whole words.
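
Why the spaces matter can be shown with a small whole-word test. This is a Python illustration of the general technique, not Hal's code; contains_word is a made-up helper name.

```python
def contains_word(sentence, word):
    """True only when `word` appears as a whole, space-delimited word."""
    # Padding both strings with spaces lets words at either end match too.
    return f" {word.upper()} " in f" {sentence.upper()} "

print(contains_word("gun VHZ shy", "shy"))    # True
print(contains_word("gunshy", "shy"))         # False: part of a longer word
print(contains_word("Are you shy?", "shy"))   # False: "shy?" is not " shy "
print(contains_word("Are you shy ?", "shy"))  # True once the "?" is spaced off
```

The third and fourth lines show the reason punctuation gets a space on each side: without it, the word and its punctuation look like one unknown token.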


=vonsmith=
« Last Edit: February 23, 2004, 01:10:02 am by vonsmith »
 

Padriag

« Reply #7 on: February 23, 2004, 10:05:46 am »
Not bad for a guy who hasn't messed with programming since BASIC back in High School... ummm... 20 years ago.  LOL
 

vonsmith

« Reply #8 on: February 23, 2004, 12:03:01 pm »
Padriag,
It sounds like you really want to tinker with Hal's brain. If you haven't already, you should download a copy of "scrdoc56en.exe". This is Microsoft's documentation file for VBScript and JScript. If you can't find it I'll send you a copy.

That application will explain all of the script commands used in Hal. When using that document you have to ignore any commands that say "JScript" at the top of the page, because those belong to JScript, Microsoft's JavaScript dialect, not to VBScript. You want to refer to commands that say "Visual Basic Scripting Edition" at the top of the page. For the script commands listed on those pages check down at the bottom left corner under "Requirements". Only use the commands that support "Version 1". To the best of my knowledge Hal doesn't support script commands above Version 1.

That Windows script documentation is great to leave open on the desktop for reference while walking through some Hal script.

A good reference and learning book is "VBScript in a Nutshell, 2nd Edition" by Paul Lomax, about $24.50 at Amazon. Honestly I use the free Windows script documentation most of the time.


=vonsmith=
« Last Edit: February 23, 2004, 12:03:43 pm by vonsmith »
 

Padriag

« Reply #9 on: February 23, 2004, 12:31:29 pm »
You're right, I would love to.  It's something that really fascinates me and I could probably spend way too much time on.  It's funny you should bring this up because I was just thinking this morning I should probably pick up a book on VBScript and spend a couple hours each week teaching myself.  I have a good grasp of logic having grown up around computers and computer programming.  So while I haven't actively done a lot of programming and my experience with the languages is limited, I usually get the logic pretty quickly.  I'll take a look at picking up that book and will keep your remarks in mind.

BTW, I asked this elsewhere but it's probably best answered here.  Exactly how much can Hal be modified using VBS?  That is, can I add new functions and new capabilities with these scripts or does that have to be done with code elsewhere?

Specifically these are some projects I'd like to see done (maybe tackle myself)
Teach Hal to give general reminders at random intervals.  For example, I am notorious for forgetting to eat when I get engrossed in something.  Having Hal remind me to eat at random times would be useful to me.  If I could add an additional function so that Hal monitors how long I've been working at the computer and uses that as an index to increase the likelihood of a reminder-to-eat event, that would work even better.  (Because Hal could then figure out when I'm engrossed in something at the computer and need to be reminded to eat.)

Another idea is setting Hal up to keep a journal, with the option for Hal to read this journal so it can learn about me from it.  This would help Hal personalize itself to the user.  I may not use it since giving up my beloved handwritten journals is about as likely as snow on the sun... but others might find it really useful and it would be a way for Hal to learn.

I've got quite a few more ideas.  Once I have an idea what can be done with scripts, I'd be happy to share them here as potential projects if anyone is interested.
 

vonsmith

« Reply #10 on: February 23, 2004, 02:15:39 pm »
Padriag,
The types of projects you mention below are possible with script.

quote: =================
Specifically these are some projects I'd like to see done (maybe tackle myself) Teach Hal to give general reminders at random intervals. For example, I am notorious for forgetting to eat when I get engrossed in something. Having Hal remind me to eat at random times would be useful to me. If I could add an additional function so that Hal monitors how long I've been working at the computer and uses that as an index to increase the likelihood of a reminder-to-eat event, that would work even better. (Because Hal could then figure out when I'm engrossed in something at the computer and need to be reminded to eat.)
========================

You would need the <auto> function to do them. Some people have had a challenging time getting it to work to their satisfaction. Do a search of the forum going back about 2 months to find related info on it. I'm sure Don has some experience with it too.

Go to:
http://www.zabaware.com/forum/topic.asp?TOPIC_ID=996&SearchTerms=%3CAUTO%3E

Look at this in particular:
<AUTO>x</AUTO> Makes a call to the GetResponse script with the user sentence "AUTO-IDLE" every x milliseconds if user is idle.

Robert Medeksza was nice enough to add that function to Hal 5.0 at my urging. Forum members have had some fun with it. Sadly I haven't had time to explore it yet. I will soon though, since I need it to support some of my new Hal projects.
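
To make the idle callback concrete, here is a toy model in Python. It assumes only what the quoted description says: the script is called with the sentence "AUTO-IDLE" when the user is idle. The function names, the reminder list and the reply wording are all hypothetical, and real Hal script would be VBScript inside GetResponse.

```python
IDLE_SENTENCE = "AUTO-IDLE"

def auto_tag(milliseconds):
    """Build the <AUTO>x</AUTO> tag text for a given interval."""
    return f"<AUTO>{milliseconds}</AUTO>"

def get_response(user_sentence, reminders):
    """Toy dispatcher: treat the idle sentinel differently from real input."""
    if user_sentence.strip().upper() == IDLE_SENTENCE:
        # Idle callback: speak the first pending reminder, or stay silent.
        return "Reminder: " + reminders[0] if reminders else ""
    # Anything else is ordinary conversation (echoed here as a placeholder).
    return "You said: " + user_sentence

print(auto_tag(60000))
print(get_response("AUTO-IDLE", ["eat lunch"]))
```

The main design point is that the idle call arrives through the same entry point as typed input, so the script has to recognise the sentinel before any normal processing touches it.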

Good luck,


=vonsmith=
 

Padriag

« Reply #11 on: February 23, 2004, 02:54:16 pm »
Thanks again Scott... I'm taking notes here (too bad I can't get Hal to take the notes for me... ah... another project! ;) )  I can see now that what is going to give me the most trouble is understanding the dll file, what it does and doesn't do and what I can do in scripts versus what I need to use it for, but that will come with experience.  It's just a matter of learning the quirks of the language.

My basic idea for the reminder with persistence was to use the existing code for appointments with a few modifications.  Rather than a specific date, it would use a random number compared to a variable.

Call the variable P for persistence.
When you ask Hal to remind you of something in this way, it asks you first what the reminder is... whatever you say next it stores as the reminder, which it will call up again and feed back to you when the event is triggered.
Next it asks you how important this is or how often you want to be reminded.
As an example, if you say it's very important then Hal keys on the word VERY and compares that word to an internal lookup to get a number... let's call it 50.
It then generates a number from 1-100 or 1-1000 or whatever... if P > RandNum it triggers a reminder event.  Obviously there's a lot of room here for tinkering with the timing.  With 1-100, a 50 means it's going to spend half its time reminding you to do something... which would constitute harassment (can you get a restraining order against your bot?  LOL)
To make this work a bit better, we might add an additional function so that it generates this number only once every so many milliseconds... so that in reality it only generates this number once every minute or so... and then decides.  Even still... at a value of 50 on 1-100 at once per minute the average reminder will come once every two minutes... still a lot.  But you get the idea.
Anyway, once triggered, Hal then pulls up your reminder from wherever it stored it and perhaps adds a prefix or suffix statement for a little variety.

That's about all there would be to the basic logic.  Detecting for Idle time would take some additional code to do what I want, but its manageable I think.
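
The logic above can be sketched like this. It is Python rather than Hal's VBScript, just to check the arithmetic; the IMPORTANCE table, the default value and the function names are invented for illustration.

```python
import random

# Hypothetical lookup from an importance word to the persistence value P.
IMPORTANCE = {"SOMEWHAT": 10, "VERY": 50}

def persistence_from(answer, default=5):
    """Key on the first importance word in the user's answer."""
    for word in answer.upper().split():
        if word in IMPORTANCE:
            return IMPORTANCE[word]
    return default

def should_remind(p, rng):
    """One check: draw RandNum from 1-100 and trigger when P > RandNum."""
    return p > rng.randint(1, 100)

def expected_minutes_between(p, check_every_minutes=1.0):
    """Average gap between reminders when one check runs per interval."""
    # P > RandNum is true for RandNum = 1 .. P-1, i.e. P-1 outcomes out of 100.
    chance = min(max(p - 1, 0), 100) / 100
    return float("inf") if chance == 0 else check_every_minutes / chance

p = persistence_from("it is very important")
print(p, round(expected_minutes_between(p), 2))
```

With P = 50 and one check per minute the chance per check is 49 in 100, so the average gap comes out just over two minutes, which agrees with the estimate above. Dropping P or lengthening the check interval stretches the gap in proportion.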