sadatvalentine and ya all,
In answer to your question, I can't explain everything about Hal's new XTF (eXtended Topic Focus) brain here. I probably need to write a paper on it in Word and post that here.
Originally I didn't like the way Hal "focused" on topics and decided something more elaborate was needed. Don Ferguson's Auto-Topic Generating Brain is very clever, but after a lot of experimenting with it I decided I wanted to take Hal to a whole new level of identifying and staying on topic.
Summary of XTF Brain:
Hal first looks for over 800 different types of sentence fragments in the UserSentence to find a Subject. Typical fragment patterns look like these:
THAT * DOES
THE * MAY
OUR * MIGHT
THE * MUST
THAT * NEVER
THOSE * CAN
YOUR * IS
A * IS
etc...
Here the * is a single-word or multi-word fragment that is the Subject of discussion. If Hal doesn't find a fragment, he searches the UserSentence for single-word Subjects using about 65 fragment patterns like these:
* IS
* ARE
* DID
* DO
* HAD
etc...
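As a rough illustration only, the fragment scan above can be sketched in Python with regular expressions. The actual brain is written in Hal's own script language and has 800+ patterns; the tables and function name below are my hypothetical stand-ins, not vonsmith's code:

```python
import re

# Tiny sample of the two-anchor fragment patterns (the real brain has 800+).
# Each pattern captures the * span between a leading word and a trailing verb.
TWO_ANCHOR_PATTERNS = [
    r"\bTHAT (.+?) DOES\b",
    r"\bTHE (.+?) MAY\b",
    r"\bOUR (.+?) MIGHT\b",
    r"\bYOUR (.+?) IS\b",
    r"\bA (.+?) IS\b",
]

# Single-word fallback patterns: one word directly before a verb.
SINGLE_WORD_PATTERNS = [
    r"\b(\w+) IS\b",
    r"\b(\w+) ARE\b",
    r"\b(\w+) DID\b",
    r"\b(\w+) DO\b",
    r"\b(\w+) HAD\b",
]

def find_subject_fragment(sentence: str) -> str:
    """Return the captured * fragment, trying multi-word patterns first."""
    for pattern in TWO_ANCHOR_PATTERNS + SINGLE_WORD_PATTERNS:
        match = re.search(pattern, sentence.upper())
        if match:
            return match.group(1)
    return ""  # no fragment found; fall back to other searches
```

Note the non-greedy `(.+?)`, so the capture stops at the first trailing verb rather than swallowing the rest of the sentence.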
The processing following this gets very complex. Examples explain it better than words.
--------------
UserSentence = "YOUR SENSE OF HUMOR CERTAINLY IS TOTALLY AMAZING." <-- contains YOUR * IS fragment.
The Subject is then = "SENSE OF HUMOR CERTAINLY". The adverbs and other poor subject words are then filtered out, resulting in:
Final Subject is then = "SENSE OF HUMOR"
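The filtering step might look like the following sketch. The stop list here is purely illustrative; the real brain's filter list hasn't been published:

```python
# Hypothetical stop list of adverbs and other "poor" subject words.
POOR_SUBJECT_WORDS = {"CERTAINLY", "OFTEN", "VERY", "TOTALLY", "REALLY", "CAREFULLY"}

def clean_subject(fragment: str) -> str:
    """Drop adverbs and other poor subject words from a captured fragment."""
    kept = [word for word in fragment.split() if word not in POOR_SUBJECT_WORDS]
    return " ".join(kept)
```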
--------------
Examples of UserSentences and the resulting final Subjects:
Notice that all Subjects are changed to singular form when possible.
"MOTORCYCLES ARE VERY COOL" --> Subject = "MOTORCYCLE"
"MY LITTLE PUPPIES ARE BARKING." --> Subject = "PUPPY"
Adverbs and other poor words, like "often", are filtered out.
"YOU OFTEN ARE CONFUSED." --> Subject = "YOU"
"THE BIG COW CAREFULLY TASTED THE GRASS." --> Subject = "BIG COW"
Multiple word Subjects are possible.
"THE LOS ANGELES AIRPORT IS HUGE." --> Subject = "LOS ANGELES AIRPORT"
Some special cases are allowed, like "AND" compound subjects.
"THE CATS AND DOGS ARE RAINING." --> Subject = "CATS AND DOGS"
"OR" and "NOR" compounds are not allowed.
---------------
Topic Focus is derived from the last word in Subject. Subject = "BIG COW" results in Topic = "COW". All "COW" related sentences are saved to Q&A file "<Username>XTF_COW.brn". All the reasons for this are too complex to discuss right now.
Subjects of up to 4 words are valid and are saved to the files:
XTF_Topics1.brn (contains single word Subjects)
XTF_Topics2.brn (contains double word Subjects)
etc...
If a Subject fragment is not found in the UserSentence, then Hal searches for a match starting with XTF_Topics4.brn and working down to XTF_Topics1.brn. The first match wins.
Example:
"THE LOS ANGELES AIRPORT IS HUGE." results in finding Subject = "LOS ANGELES AIRPORT" and a Topic Focus of "AIRPORT". Hal's existing HalBrain.TopicSearch() function just looks for any matching word or words without any prioritization. Hal's old method might come up with "HUGE", "ANGELES" or "LOS ANGELES" as a topic.
To maintain focus the XTF can find words or phrases in a UserSentence that match the previous topic.
Example:
First UserSentence has a Subject fragment and is:
"THE LOS ANGELES AIRPORT IS HUGE." --> Subject = "LOS ANGELES AIRPORT" and CurrentTopic = "AIRPORT"
...the next UserSentence has no Subject fragment and is:
"I HAVE TO DRIVE TO THE LOS ANGELES AIRPORT" --> Subject = "", topic search finds "LOS ANGELES AIRPORT" and this Subject matches PreviousTopic = "AIRPORT" so the CurrentTopic = "AIRPORT" still. The original Hal might have chosen "DRIVE" or some other word or phase.
But wait there's more!
If the next UserSentence was:
"I HAVE TO GO TO THE AIR TERMINAL" --> Subject = "I" but a search reveals that "AIR TERMINAL" is a Meronym of the PreviousTopic = "AIRPORT" so CurrentTopic = "AIRPORT" still.
It gets even more complicated, but I won't discuss that now either. Just trust me, Hal clings to the current topic the best he can. Each Q&A topic file has a matching "related words" file if that topic has any Meronyms or Synonyms in WordNet. Hal can automatically generate (i.e., learn) new topics by identifying Subjects, deriving the base Topic word, updating his XTF_Topics(1 thru 4).brn files and the Q&A file, and creating a "related words" file when those words exist. I have prepopulated the XTF_Topics1.brn file with hundreds of words, although I haven't populated the Q&A files containing responses. That's your job.
If Hal can't find the Subject in UserSentence, and can't match anything to the previous topic, then he searches the UserSentence for any general topic. If he still can't find a topic Hal's XTF brain doesn't respond and Hal answers using some other standard part of his brain. That's the XTF brain in a nutshell.
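In a nutshell, the decision order reads as a cascade. Here is a bare sketch of that control flow with trivial stub steps (all names and stub behaviors are mine, just to show the fall-through order):

```python
def find_subject_topic(sentence: str, previous_topic: str) -> str:
    return ""  # stub: pretend no Subject fragment was found

def match_previous_topic(sentence: str, previous_topic: str) -> str:
    # stub: keep the topic when the sentence mentions it verbatim
    if previous_topic and previous_topic in sentence.upper():
        return previous_topic
    return ""

def general_topic_search(sentence: str, previous_topic: str) -> str:
    # stub: fall back to the last word of the sentence
    words = sentence.upper().rstrip(".?!").split()
    return words[-1] if words else ""

def xtf_topic(sentence: str, previous_topic: str) -> str:
    """Try each XTF step in order; an empty result means 'not found'."""
    for step in (find_subject_topic,      # 1. Subject fragment patterns
                 match_previous_topic,    # 2. stay on the previous topic
                 general_topic_search):   # 3. any general topic
        topic = step(sentence, previous_topic)
        if topic:
            return topic
    return ""  # no topic: some other standard part of Hal's brain answers
```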
The XTF brain also contains enhancements to learn and remember the User's different nicknames and use them, learn to capitalize words properly, correct some spelling, change British spelling into American spelling, add "?" to the end of sentences that need them, tell jokes on request and several other small refinements to the "hello" and "bye" script and other original scripts.
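Two of those small refinements, Americanizing spelling and adding a missing "?", might look roughly like this. The word tables are tiny illustrative samples, not the shipped ones, and the sketch assumes Hal's all-caps UserSentence convention:

```python
# Illustrative entries only; the real tables ship with the XTF brain.
BRITISH_TO_AMERICAN = {"COLOUR": "COLOR", "REALISE": "REALIZE", "CENTRE": "CENTER"}
QUESTION_STARTERS = ("WHO", "WHAT", "WHEN", "WHERE", "WHY", "HOW",
                     "IS", "ARE", "DO", "DID", "CAN")

def tidy_sentence(sentence: str) -> str:
    """Americanize spelling and add a missing '?' to obvious questions."""
    words = sentence.upper().rstrip(".?!").split()
    words = [BRITISH_TO_AMERICAN.get(word, word) for word in words]
    out = " ".join(words)
    if words and words[0] in QUESTION_STARTERS:
        return out + "?"   # a question that needed its "?"
    return out + "."
```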
Okay, so what's the bad news? Well, none, except that I don't know how fast this brain will run on a slow computer. It has been tested on a 1.8GHz PC with no problems. Also, for design reasons, this brain is English-only. And if you have extensive Q&A files built up from your diligent efforts at teaching Hal, you'll have to manually cut and paste the contents of each old Q&A topic file into the new XTF topic Q&A files. The format for those files is the same.
Whew! Long post and that's just a summary. [xx(]
I need to test and debug the XTF brain for another week or two before releasing it.
= vonsmith =