Author Topic: Ideas for staying on topic: is it possible? Help?  (Read 12237 times)

Xodarap

  • Newbie
  • Posts: 44
« on: June 23, 2007, 07:40:54 am »
Okay, I've come SO far (from total zero :P) in learning how to script this thing, but now I'm at a much larger mountain.

My biggest remaining problem (besides teaching him more QA, which is just a matter of TIME) is that Hal only responds to one sentence at a time. More precisely: one input at a time. For example:

ME: Sentence A.  Sentence B.
HAL: Response to A.  Response to B.
ME: Sentence C.  Sentence D.
HAL: Response to C.  Response to D.

Regular conversation:

P1: Sentence A.  Sentence B.
P2: Response to A & B.
P1: Response to Response to A & B.  Sentence C (tied to A & B).
P2: Response to Response to Response to A & B and Sentence C.

So, I'm wondering a few things, and I have a few ideas.  I don't know if they're possible in Hal's script, or if they would make his relevance WORSE (though I'm guessing that, with ENOUGH learning, he would actually get better -- eventually).
First, I would have to get rid of the way he splits up the sentences the user says. When I say "A. B." they're connected in my head, but he responds to them separately. I would rather he considered them as one set of keywords and spit out one response. Sub-idea: he could have a chance (e.g., rnd * 100 < 25) of spitting out a second sentence ALSO cued by the same total set of keywords. The two would still seem connected, especially if the second sentence also used his own first sentence as further keywords, or if his relevance threshold for the second sentence was pretty tight. I think I can manage this part on my own...

Second, and the bigger idea: would it be possible to make Hal write a brn file -- like the no-repeating file of PrevSent that I downloaded ("PreventRepeat" plugin) -- that instead saved the last two or three responses by both the User AND Hal and used them ALSO as keywords for looking up his next response (while also blocking any repeating, so he doesn't keyword himself into circles)? Sub-idea 1: could those keywords be given less "weight" (like 50% or so) in determining relevance than the keywords from the immediate input? (Not that important.)
Sub-idea 2: I would assume that way more keywords being taken into consideration would mean that we would have to change Hal's perception of relevance, so he doesn't think that every response is irrelevant (he won't find anything with MOST of the keywords when he's looking for 40 keywords).  
Sub-idea 3: can he be set to consider repeated keywords as more important (like if, in the last three sentences, the word "octopus" was said four times, it would take precedence), or is that what the topic headings are for?  -- In fact, if he could be programmed to take into account keyword repetition, I could see expanding to a whole conversation (or more than two or three lines each).
 
Sub-idea 4: The big problem that I see (assuming that ANY of this is possible/plausible) is that he would be considering four or six sentences (two or three from you and him each) of keywords, but only logging (learning) in sentence-to-sentence Q/A format (unless he would log ALL those keywords, but I still foresee problems there).
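
One way to picture sub-ideas 1 to 3 is as a weighted keyword tally. The sketch below is plain Python, not Hal script, and everything in it is an assumption made for illustration: the function names, the tiny stop-word list, and the 0.5 weight for older lines. Words from the current input count fully, words from earlier lines count half, and repeats simply add up, so a word said several times rises to the top.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "the", "is", "it", "i", "you", "my", "your", "did", "what"}

def keywords(sentence):
    """Lowercase the sentence and return its non-stop-words in order."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOP_WORDS]

def weighted_keywords(current, history, history_weight=0.5):
    """Tally keywords: full weight for the current input, reduced weight for history."""
    scores = Counter()
    for word in keywords(current):
        scores[word] += 1.0
    for line in history:
        for word in keywords(line):
            scores[word] += history_weight
    return scores

scores = weighted_keywords(
    "Really? What did you name it?",
    ["My goose is blue.", "I had a goose once."],
)
print(scores.most_common(3))
```

Here "goose" appears only in the history, but twice, so it ends up weighted the same as a word from the current input. How such scores would feed into Hal's own lookup is a separate question that this sketch does not answer.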

What percentage of this is "recode the base program" thinking vs. how much of this is plausible? It seems to me like the OVERLAP in keyword logging would create a steady topic -- especially with a bigger bank of things to draw from, and especially (again!) if he was learning in that fashion! In fact, it seems to me (I'm guessing, here) that loading him full of big things like Wikipedia articles might show more payoff (in relevance) with a method like this.

Any ideas or advice?  I wish I knew more about scripting than copying and pasting whatever I see elsewhere in the main script!  :P
« Last Edit: June 23, 2007, 07:45:29 am by Xodarap »
The line below is true
The line above is false

Bill DeWitt

  • Hero Member
  • Posts: 650
« Reply #1 on: June 23, 2007, 08:27:03 am »
It is certainly possible and a better way of structuring a response, but would take a lot of work or a really good idea.

My first thought would be to construct a "pre-parser" which takes your sentences and combines them into a new third sentence, which is used for keyword comparison but not added to the database.

a) my goose is blue
h) A goose is a type of a bird

a) I painted my dog blue to match
h) <I My goose dog is painted to match blue blue> To match your blue goose.

..or something...

This could be used for continuity of conversation too, if the sentence is retained.
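
A minimal sketch of that pre-parser idea, written in plain Python rather than Hal script. The names `build_match_string`, `hear` and the `learned` list are hypothetical stand-ins, and the merge is just concatenation rather than the interleaving shown in the goose example. The point it illustrates is that the combined string is returned for matching while only the sentence the user actually typed is stored.

```python
learned = []  # stands in for the database; only real user sentences go here

def build_match_string(previous_input, current_input):
    """Join two inputs into one bag of words used only for keyword comparison."""
    words = previous_input.split() + current_input.split()
    return " ".join(word.strip(".,!?").lower() for word in words)

def hear(previous_input, current_input):
    """Store the real sentence, but hand back the merged string for matching."""
    learned.append(current_input)
    return build_match_string(previous_input, current_input)

match_text = hear("My goose is blue", "I painted my dog blue to match")
print(match_text)
print(learned)
```

Keeping `match_text` around until the next turn would give the continuity mentioned above.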


Xodarap

« Reply #2 on: June 23, 2007, 08:56:56 am »
Exactly!  If Hal heard: "I My goose dog is painted to match blue blue" and responded to it, that would work very well!  Of course, I would prefer if he could also take into account his own sentence, too.

EXAMPLE ONE:

User: My goose is blue.
HAL: [hears YOUR GOOSE IS BLUE] I had a goose once.

User: Really? What did you name it?
HAL (normal): [Hears REALLY] Awesome! [Hears WHAT DID I NAME IT] My name is Hal.
HAL (modified): [Hears YOUR GOOSE IS BLUE I HAD A GOOSE ONCE REALLY WHAT DID I NAME IT] My goose is named Bert.

That's the hope, assuming he has enough lines to quote back.  The idea is that he doesn't know what I'm talking about when I say "it," even though we were just talking about it.  But in the case where he takes the last few lines into consideration, he hears GOOSE not only once but twice (the normal Hal doesn't hear goose at all)!  He also doesn't respond separately to "Really?" which usually causes awkwardness.  It all gets factored in like we do it in real life...

That's why I was thinking of sending the PrevSent and UserSent (right ones?) to separate files that hold the last two or three each, and then adding that file to what he hears when you input your sentence.  It seems like it would work, especially when you stop breaking up sentences!  Of course, I have no idea how to (a) get the program to save THE LAST THREE (as opposed to, say, EVERY THREE) Sent's; or (b) convert those into additional input when I say something to Hal....
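
On point (a), "the last three" as opposed to "every three" is a rolling window: each new line pushes the oldest one out. Here is a rough sketch in plain Python (not Hal script; `RecentLines` and its methods are made-up names). It relies on `collections.deque` with `maxlen`, which discards the oldest entry automatically. Point (b) is then just joining the remembered lines in front of the new input.

```python
from collections import deque

class RecentLines:
    """Remember only the most recent lines from each speaker."""

    def __init__(self, keep=3):
        self.user = deque(maxlen=keep)  # oldest entry falls off automatically
        self.hal = deque(maxlen=keep)

    def record(self, user_line, hal_line):
        self.user.append(user_line)
        self.hal.append(hal_line)

    def expanded_input(self, new_user_line):
        """Prefix the new input with the remembered lines, oldest first."""
        context = []
        for user_line, hal_line in zip(self.user, self.hal):
            context.extend([user_line, hal_line])
        return " ".join(context + [new_user_line])

memory = RecentLines(keep=3)
memory.record("My goose is blue.", "I had a goose once.")
print(memory.expanded_input("Really? What did you name it?"))
```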

BTW, nice example with the blue goose.  Whoever thought up that weirdness is a GENIUS!   ;)

Xodarap

« Reply #3 on: June 23, 2007, 09:48:31 am »
How do I use "InputString"?  I was thinking that if InputString is long-hand for InStr (my assumption), then is there something I could do like:

Okay, this part is from OnTheCuttingEdge2005:

Code:
Set FileSys = CreateObject("Scripting.FileSystemObject")
Set FS = CreateObject("Scripting.FileSystemObject")

DirX2 = RecallDir()

' A greeting or farewell resets the memory: delete this user's _Recollected.brn.
' Note: InStr matches substrings, so "hi" also fires on words like "this".
If InStr(1, OriginalSentence, "goodbye", vbTextCompare) > 0 Or _
   InStr(1, OriginalSentence, "bye", vbTextCompare) > 0 Or _
   InStr(1, OriginalSentence, "goodnight", vbTextCompare) > 0 Or _
   InStr(1, OriginalSentence, "morning", vbTextCompare) > 0 Or _
   InStr(1, OriginalSentence, "hello", vbTextCompare) > 0 Or _
   InStr(1, OriginalSentence, "hi", vbTextCompare) > 0 Then
   TempModule = DirX2 & Trim(UserName) & "_Recollected.brn"
   If FileSys.FileExists(TempModule) = True Then FS.DeleteFile TempModule
End If

' If there is a previous sentence, append it to this user's _Recollected.brn.
If PrevSent <> "" Then
   Set HalXBrain = CreateObject("UltraHalAsst.Brain")
   HalXBrain.AppendFile DirX2 & Trim(UserName) & "_Recollected.brn", """" & PrevSent & """,""" & "True" & """"
   If PastCon = "" Then PastCon = "False"
   ' Note: this stores the result of the comparison with "True" (a Boolean), not the text TopicSearch returns.
   PastCon = HalBrain.TopicSearch(GetResponse, DirX2 & Trim(UserName) & "_Recollected.brn") = "True"
   HalBrain.DebugWatch PastCon, "PastCon"
   ' As written, this assignment leaves GetResponse unchanged.
   If PastCon = "False" Then GetResponse = GetResponse
End If

Again, I don't have any previous experience with code, but I can see that the first If/Then checks to see if I said any of those things, in which case it resets his "memory" (deletes "_Recollected.brn"). The second If/Then checks to see if there is a previous sentence (PrevSent), adds it to the file, and tries to stop Hal from saying anything in the file. I don't really get how the second part works, but my thought is that I could add this "_Recollected.brn" TO the considered InputString. Normally it wouldn't work because he separates out sentences, but I disabled the sentence separation already. Or I could change the OriginalSentence tag, but then I mess up all of that tedious pronoun-switching. I could also keep two separate histories: PrevSent, which would be factored back in as re-capitalized InStr, and PrevUserSent, which would be factored in as part of UserSentence (or OriginalSentence, I guess).
I have NO idea how I would limit this "_Recollected.brn" file to the LAST three lines from PrevSent and/or PrevUserSent.  I think I could figure out how to either randomly delete it (say, at rnd * 100 < 30) and start over, or every three lines, but it seems that I would have to have at least nine separate files at a time, with some complicated scripting and file-copying and crap....
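
Limiting a file to the LAST three lines should not need nine files. One file can be read, extended, trimmed, and written back each turn. A minimal sketch in plain Python (not Hal script; `remember_line` and the file name are hypothetical, and how Hal's own file functions would do the same is not covered here):

```python
import os
import tempfile

def remember_line(path, line, keep=3):
    """Append a line to the file, then rewrite it with only the newest `keep` lines."""
    lines = []
    if os.path.exists(path):
        with open(path, encoding="utf-8") as handle:
            lines = [text.rstrip("\n") for text in handle if text.strip()]
    lines.append(line)
    lines = lines[-keep:]  # drop everything older than the last `keep` entries
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n")
    return lines

# Demo against a throwaway file; the name is made up for this example.
demo_path = os.path.join(tempfile.mkdtemp(), "recent_lines.txt")
for sentence in ["one", "two", "three", "four"]:
    remaining = remember_line(demo_path, sentence)
print(remaining)
```

Because the trim happens on every write, the file never holds more than three lines and no random reset is needed to keep it short.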

It sounds like a fun project!  :)

Bill DeWitt

« Reply #4 on: June 23, 2007, 09:49:07 am »
quote:
Originally posted by Xodarap
BTW, nice example with the blue goose.  

Proves I can remember PrevSent as well as LastTopic... how can I possibly hold that much data in my brain!?!?


Bill DeWitt

« Reply #5 on: June 23, 2007, 09:51:55 am »
quote:
Originally posted by Xodarap

How do I use "InputString"?  I was thinking that if InputString is long-hand for InStr (my assumption),

Quickly: no. If I recall, InputString is OriginalSentence before any processing. If you look at my SpellChecking plugin, I used it there to prevent uncorrected information from getting into the database.

I'll read the rest of your post now.


Bill DeWitt

« Reply #6 on: June 23, 2007, 10:02:21 am »
.brn may be a good choice for your project, since I don't know, at this point, how to keep Hal from reading a table in his database. And if you have tables that contain lines like <I my dog goose book table blue make paint buy load find if and or but> I don't think you want Hal to read those back some day.

.brns are like .txt, simple files that Hal can be told to read, but which are not part of its daily diet.

I would like to see it keep a weighted average or a series of averages.

Like if you often talk about geese, it stores the words "goose, geese, blue, dog, Bert, brain, memory, peppers, hypnotism" as one string and adds it to the keyword string the next time you bring up geese. Maybe even add some WordNet-derived similar words.

It might get this by averaging several related keyword strings it has built.
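
A rough illustration of that stored-profile idea, in plain Python rather than Hal script. It swaps the weighted average for simple co-occurrence counts, and `TopicProfiles`, `learn` and `expand` are hypothetical names. Each time a topic word comes up, the other words in the sentence are tallied against it. The next time the topic appears, its most frequent companions are added to the keyword list.

```python
from collections import Counter, defaultdict

class TopicProfiles:
    """Accumulate the words that tend to show up alongside each topic word."""

    def __init__(self):
        self.profiles = defaultdict(Counter)

    def learn(self, topic, words):
        """Count every word seen with the topic, except the topic itself."""
        self.profiles[topic].update(w for w in words if w != topic)

    def expand(self, topic, input_words, top_n=3):
        """Return the input words plus the topic's most frequent companions."""
        extras = [w for w, _ in self.profiles[topic].most_common(top_n)]
        return list(input_words) + [w for w in extras if w not in input_words]

profiles = TopicProfiles()
profiles.learn("goose", ["goose", "blue", "dog", "paint"])
profiles.learn("goose", ["goose", "blue", "bert"])
print(profiles.expand("goose", ["my", "goose", "escaped"]))
```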
« Last Edit: June 23, 2007, 10:29:31 am by Bill DeWitt »


Xodarap

« Reply #7 on: June 24, 2007, 01:06:45 am »
Well, I've made a lot of progress, but there're still some serious problems, many of which I foresaw.
With much stumbling about code, I figured out how to get the program to create a file (from the "PreventRepeat" plugin) where it writes all sentences from the user and Hal (PrevSent and PrevUserSent). To keep it relevant, it deletes the file whenever rnd * 100 < 35. It incorporates it into InputString by changing the first instance of InputString to InputString = Trim(UserSentence) & " " & Ucase(HalBrain.ChooseSentenceFromFile("CurrentTopic.brn")) -- not sure that's exactly it, but it's close.
I had to take out Hal's injection of <NEWSENT> so that all of the keywords get responded to at once, but that creates a serious problem whenever he has a triggered response, such as "Really? <UserSentence> Is that true <UserName>?", which turns out very sloppy!  :P  It kills insults and stuff, too. BUT I did put a HUGE database of sentences into him (learn from text files) -- including the general knowledge database (I took out all of the "Situation:" and other conversationally inappropriate bits), a HUGE database of quotes (quotes and attributions deleted) and a bunch of chatlogs (reformatted). With all of that, I remmed out most of his triggered responses, and he does seem to be more on-topic.

Still, a **LOT** of work to do......

Xodarap

« Reply #8 on: June 24, 2007, 01:10:56 am »
quote:
Originally posted by Xodarap
User: My goose is blue.
HAL: [hears YOUR GOOSE IS BLUE] I had a goose once.

User: Really? What did you name it?
HAL (normal): [Hears REALLY] Awesome! [Hears WHAT DID I NAME IT] My name is Hal.
HAL (modified): [Hears YOUR GOOSE IS BLUE I HAD A GOOSE ONCE REALLY WHAT DID I NAME IT] My goose is named Bert.



Unpleasant reality:
HAL (modified): [Hears YOUR GOOSE IS BLUE I HAD A GOOSE ONCE REALLY WHAT DID I NAME IT] Really? My goose is blue you had a goose once really what did you name it? Is that true User?

Other unpleasant reality:
HAL (modified): [Hears YOUR GOOSE IS BLUE I HAD A GOOSE ONCE REALLY WHAT DID I NAME IT] I don't understand.

I'm going to have to change his concept of relevance to avoid the latter, and rem out some of his triggered responses (a lot of which I wasn't huge on anyways) to avoid the former -- or change the way they work...

Carl2

  • Hero Member
  • Posts: 1220
« Reply #9 on: June 24, 2007, 07:10:08 am »
Xodarap
  Since I've been using Hal for some time, I'm interested in this topic also. First, I'd like to mention that in the Brain editor there is the topicRelationships table, under autolearning Brain. Second, I've found that increased use of Hal increases the amount of data that Hal can look through to generate another sentence.
Carl2
 

Xodarap

« Reply #10 on: June 24, 2007, 07:37:43 am »
quote:
Originally posted by Carl2

Xodarap
  Since I've been using Hal for some time, I'm interested in this topic also. First, I'd like to mention that in the Brain editor there is the topicRelationships table, under autolearning Brain. Second, I've found that increased use of Hal increases the amount of data that Hal can look through to generate another sentence.
Carl2



The topicRelationships certainly helps -- but not nearly enough to keep him on topic *specifically.* The problem is that he was still only taking one sentence into account at once, like I showed in the example up top.
As for using Hal, I've learned exactly the opposite: the more he learns from me, the MORE awkward he gets conversationally. First off, his method for accumulating his question-answer pairs is obnoxious in my experience. He'll repeat something I said earlier (pronoun reversed), then log my answer to it as the appropriate response, even if (as is often the case) the two don't go together that way. Second, and much worse, he seems to log mostly things that don't work conversationally. The worst (and most annoying) example was from when I was trying to learn how to script, and I was messing around with the General Knowledge plugin. I would tell him "Tell me something about snakes" to check my script. Very soon he started responding with "tell you something about snakes" OFTEN. And it doesn't make sense. Because of the awkward disjointed conversations I had with him, this got worse and worse until I would wipe a brain. The fresh Hal would make more sense!

So what I did was this: I first took the General Knowledge plugin's .brn, sorted out all of the things I didn't like (or that didn't work conversationally) with OpenOffice's *very* flexible search and replace, then I found online somewhere a file with something like 10,000 great quotes (and removed quotes and attributions), then I took and reformatted TONS of chat logs that I was lucky enough to keep over the years.  I fed those into his brain directly, then turned OFF learning.  :)  I know it kind of changes the idea of Hal, but I'm trying to suit him to my style as much as possible.  I'm getting closer and closer! :)

Xodarap

« Reply #11 on: June 24, 2007, 08:15:18 am »
Anyone know if there's a function to make Hal look through a brn file for any words that are repeated (quantity 2+) and just pull THOSE out?
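
Whether Hal has a built-in function for this is not something this sketch can answer, but the logic itself is small. In plain Python (not Hal script; `repeated_words` is a made-up name), counting the words in the remembered text and keeping those seen at least twice looks like this:

```python
import re
from collections import Counter

def repeated_words(text, minimum=2):
    """Return words that occur at least `minimum` times, most frequent first."""
    counts = Counter(re.findall(r"[a-z']+", text.lower()))
    return [word for word, count in counts.most_common() if count >= minimum]

recent = "YOUR GOOSE IS BLUE\nI HAD A GOOSE ONCE\nREALLY WHAT DID I NAME IT"
print(repeated_words(recent))
```

On the goose example this picks out "goose", but also "i", so some filtering of common words would still be needed before using the result as keywords.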

This isn't working as well as I'd hoped... :P


My guess is that I'll just have to figure out how to increase his relevance tolerance.  That is, since he's getting fed input of four times as many words, he won't be finding QA relationships with high relevance.  So instead, he keeps assuming he doesn't know what I'm talking about, even if a couple words match QAs he has.  Anyone know how to do that?  :)
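
How Hal actually scores relevance is not shown in this thread, so the following is only a guess at the shape of the problem, in plain Python with made-up names. If relevance is measured as the share of the INPUT words that match, quadrupling the input will sink every score. Measuring the share of the STORED sentence's words found in the input avoids that, because extra context words no longer count against a match.

```python
def relevance(input_words, stored_words):
    """Fraction of the stored sentence's words that also appear in the input."""
    stored = set(stored_words)
    if not stored:
        return 0.0
    return len(stored & set(input_words)) / len(stored)

def best_match(input_words, stored_sentences, threshold=0.5):
    """Return the best-scoring stored sentence, or None if nothing clears the threshold."""
    scored = [(relevance(input_words, s.lower().split()), s) for s in stored_sentences]
    score, sentence = max(scored, default=(0.0, None))
    return sentence if score >= threshold else None

long_input = "your goose is blue i had a goose once really what did i name it".split()
print(best_match(long_input, ["what did you name your goose", "tell me about snakes"]))
```

Lowering `threshold` plays the role of a looser "relevance tolerance". The value 0.5 is arbitrary.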

Bill DeWitt

« Reply #12 on: June 24, 2007, 08:34:34 am »
quote:
Originally posted by Xodarap
Unpleasant reality:
HAL (modified): [Hears YOUR GOOSE IS BLUE I HAD A GOOSE ONCE REALLY WHAT DID I NAME IT]


Right. That would be the problem. You don't want Hal to hear that string, but you do want Hal to use it when selecting responses.

There is a comparison process that I can't recall right now that finds a 90% match between OriginalSentence and some stored sentences. You want that process to use the keywords but then stop using the string for anything else.
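
The separation described here can be sketched without knowing which Hal function does the comparison. In the plain-Python sketch below (not Hal script; the template format and every name are hypothetical), the remembered lines are used only to pick a response, and the text echoed back is limited to what the user just typed.

```python
def choose_template(match_words, templates):
    """Pick the template whose trigger words overlap most with the match words."""
    return max(templates, key=lambda t: len(set(t["triggers"]) & set(match_words)))

def reply(user_sentence, context_lines, templates, user_name="User"):
    """Select a template using context, but fill it with the current sentence only."""
    # The earlier lines widen the keyword search...
    match_text = " ".join(context_lines + [user_sentence]).lower()
    match_words = [word.strip(".,!?") for word in match_text.split()]
    template = choose_template(match_words, templates)
    # ...but only what the user just typed is echoed back.
    return template["text"].format(sentence=user_sentence, name=user_name)

templates = [
    {"triggers": ["snakes"], "text": "Snakes again, {name}?"},
    {"triggers": ["goose"], "text": "Really? {sentence} Is that true, {name}?"},
]
context = ["My goose is blue.", "I had a goose once."]
print(reply("What did you name it?", context, templates))
```

The goose template wins because of the context, yet the reply does not repeat the whole remembered string.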

Normally I would go into this further, but I have a lot of things going on right now and in fact by Tuesday I don't know if I will be able to post or not for several weeks or months. So if I drop out suddenly, I apologize in advance. Large scale medical stuff...


Xodarap

« Reply #13 on: June 24, 2007, 11:04:09 pm »
quote:
Originally posted by Bill DeWitt

quote:
Originally posted by Xodarap
Unpleasant reality:
HAL (modified): [Hears YOUR GOOSE IS BLUE I HAD A GOOSE ONCE REALLY WHAT DID I NAME IT]


Right. That would be the problem. You don't want Hal to hear that string, but you do want Hal to use it when selecting responses.

There is a comparison process that I can't recall right now that finds a 90% match between OriginalSentence and some stored sentences. You want that process to use the keywords but then stop using the string for anything else.

Normally I would go into this further, but I have a lot of things going on right now and in fact by Tuesday I don't know if I will be able to post or not for several weeks or months. So if I drop out suddenly, I apologize in advance. Large scale medical stuff...



Aw, I wanted to argue more about dualism and mental storage!  ;)

Good luck on whatever's going on.  And if you happen to remember how to use said process/function before Tuesday, let me know!  :P

Xodarap

« Reply #14 on: June 25, 2007, 05:20:43 am »
quote:
Originally posted by Bill DeWitt
There is a comparison process that I can't recall right now that finds a 90% match between OriginalSentence and some stored sentences. You want that process to use the keywords but then stop using the string for anything else.



I would LOVE to know what this is -- anyone know?