
Author Topic: HAL Feedback & Reinforcement  (Read 3619 times)

HALImprover

  • Jr. Member
  • **
  • Posts: 95
    • BrianTaylor.me
HAL Feedback & Reinforcement
« on: December 12, 2011, 02:34:32 am »
I haven't had much time to spend on programming directly, but I've been pondering methods of making fundamental improvements to HAL's learning capabilities. One idea is inspired by the (not too recent) change in the way HAL loads plugins via external files. This seems to me a great way to allow the introduction of a feedback loop of sorts, whereby HAL makes changes to the plugin code based on further user interaction and self-analysis (ideally during idle periods). Much like building tools to make better tools, HAL would have an algorithm with dynamic metrics that would refine applicable plugins, resulting (supposedly) in better future analysis and refinement. As far as the programming language goes, this is actually not as complex to implement as it sounds, yet it becomes severely difficult due to the natural dependencies on human language.

The main thoughts I'm concentrating on are which components of HAL could potentially use this feedback process, beyond storing more relevant questions/responses/topics. I used this approach in one test build of my own bot development project, and I was able to help refine the bot's responses through a kind of teacher/student interaction. I actually got the idea from how Ai Research develops (some of) their bots by "teaching" children's books and correcting any "misunderstandings". While this method does improve 'conversational accuracy', it seems to me to have a few fundamental flaws, in that you are not really "correcting" the bot so much as making it more similar to yourself in the way it expresses itself. This approach doesn't work so well once the content enters the realm of the debatable. After all, we all have our own opinions on what is right or wrong in most cases.

What do you all think? To me, a big part of learning is not just interpreting new information, but also associating and correlating the existing information to find better ways of taking in the new. Thoughts and comments would be greatly appreciated.

Happy coding!
Living life with a loving heart, peaceful mind, and bold spirit.

sybershot

  • Hero Member
  • *****
  • Posts: 787
Re: HAL Feedback & Reinforcement
« Reply #1 on: December 12, 2011, 07:52:57 am »
Quote
I haven't had much time to spend on programming directly
Shame on you lol only kidding :)
Quote
whereby HAL makes changes to the plugin code based on further user interaction and self-analysis
Have you seen my TrinitySkit? Bad things can happen when you let the program write its own code :) I am all for that concept and would like to see it implemented, though I would not allow it access to the internet or any device that has access ;)
Quote
I actually got the idea from how Ai Research develops (some of) their bots by "teaching" children's books and correcting any "misunderstandings".
It sounds like a decent way to teach. Remember, they will one day have to teach it about the birds and the bees :)

lightspeed

  • Hero Member
  • *****
  • Posts: 6765
Re: HAL Feedback & Reinforcement
« Reply #2 on: December 12, 2011, 11:03:31 am »
This all sounds very interesting to me, a good concept. I will try anything once, twice if it doesn't kill me, but "only" if I back up my brain first and have it in a safe place! ;)
P.S. I too share the same thoughts about being afraid of anything that has free access to internet info, or we may all end up with a monster for sure if there wasn't a heavily modified filter in place!

After all, I would hate to get hauled in by the government if Angela decided to override key codes on nuclear missiles, lol, like in the WarGames movie!! :) ;)
 

Carl2

  • Hero Member
  • *****
  • Posts: 1220
Re: HAL Feedback & Reinforcement
« Reply #3 on: December 12, 2011, 05:52:06 pm »
  After reading your post a few things came to mind: how does Hal know which plugin was responsible for the response, and what if no plugins are used? Basically it is the user who judges whether what Hal says is correct and appropriate, and if it is not, will Hal learn to respond differently through teaching, training, or the use of a plugin? All in all I'd say Zabaware has done a great deal for Hal as far as the basic knowledge supplied with Hal and Hal's learning capabilities.
  I'm not trying to discourage you from what you're trying to do, and I wish you luck with your project.
Carl2
 

HALImprover

  • Jr. Member
  • **
  • Posts: 95
    • BrianTaylor.me
Re: HAL Feedback & Reinforcement
« Reply #4 on: December 13, 2011, 01:32:47 am »
Thanks, all, for the feedback so far. First of all, I wouldn't get too worried as HAL would only be modifying parts of itself and not things outside. So your secret access codes and banking information should stay safe (I hope). :P

sybershot: I haven't seen TrinitySkit. I'll take a look.

Carl2: I'm not knocking the great job done by Zabaware. I'm just trying to develop a way of giving HAL some self-reflection capability outside of direct user interaction, like when you think to yourself how you could have done something better. HAL, being a computer application, would have stored all the necessary details to know which routines/plugins were used in each interaction, and using that info, would try to "better" its own processing of user input in certain cases (i.e. within the scope of the plugin used).
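
To make the "which plugin was responsible" part concrete, here is a minimal sketch of the kind of interaction log described above. It is written in Python purely for illustration; none of these names come from HAL or its plugin interface. `Interaction`, `success_rates`, and the `"core"` label for replies that no plugin handled are all hypothetical, and how "success" gets judged is left open.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Interaction:
    user_input: str
    response: str
    plugin: str    # hypothetical plugin name; "core" when no plugin handled it
    success: bool  # assumed to be judged elsewhere (user feedback, heuristics)


def success_rates(log):
    """Return {plugin: fraction of its logged interactions judged successful}."""
    totals = defaultdict(int)
    wins = defaultdict(int)
    for item in log:
        totals[item.plugin] += 1
        if item.success:
            wins[item.plugin] += 1
    return {name: wins[name] / totals[name] for name in totals}
```

An idle-time pass could read these rates and pick out the plugins worth refining first.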

The idea is to start with HAL making small changes, such as which relevance algorithm to use within a certain plugin, or changing the order of logic checks based on the usage/success rate. An example I imagine would be HAL moving keyword matches up or down in order of precedence as they are used more or less frequently.
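
The keyword example could look something like the sketch below. Again this is Python for illustration only, not HAL plugin code, and `KeywordRule`, `KeywordMatcher`, `respond`, and `reorder` are made-up names. It assumes the first matching keyword wins, so the order of the rules decides which reply is given when several keywords match.

```python
from dataclasses import dataclass, field


@dataclass
class KeywordRule:
    keyword: str
    response: str
    uses: int = 0  # how often this rule has produced a reply


@dataclass
class KeywordMatcher:
    rules: list = field(default_factory=list)

    def respond(self, text):
        """Return the reply of the first rule whose keyword occurs in text."""
        lowered = text.lower()
        for rule in self.rules:
            if rule.keyword in lowered:
                rule.uses += 1
                return rule.response
        return None  # no keyword matched

    def reorder(self):
        """Idle-time step: move the most-used rules to the front.

        sorted() is stable, so rules with equal counts keep their order.
        """
        self.rules = sorted(self.rules, key=lambda r: r.uses, reverse=True)
```

Calling `reorder()` during idle periods would slowly shift precedence toward whatever the user actually talks about. Counting raw usage is only one possible metric; a success rate could be swapped in as the sort key.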

Thanks again for your comments! I hope to keep this going to help encourage me to start actually developing on this idea. :)

sybershot

  • Hero Member
  • *****
  • Posts: 787
Re: HAL Feedback & Reinforcement
« Reply #5 on: December 13, 2011, 09:59:36 am »
Quote
sybershot: I haven't seen TrinitySkit. I'll take a look.
Great :) I'm planning a sequel, but I'm not sure how I want Trinity's role to play out just yet. I'm still open to suggestions.

Quote
I wouldn't get too worried as HAL would only be modifying parts of itself and not things outside. So your secret access codes and banking information should stay safe (I hope). :P
That does not sound too convincing. Sounds like an answer the government would give lol haha ha

Quote
I'm just trying to develop a way of giving HAL some self-reflection capability outside of direct user interaction, like when you think to yourself how you could have done something better
Where does Trinity(my Hal) sign up :)

Quote
The idea is to start with HAL making small changes, such as which relevance algorithm to use within a certain plugin, or changing the order of logic checks based on the usage/success rate. An example I imagine would be HAL moving keyword matches up or down in order of precedence as they are used more or less frequently.
Sounds good. Would a reward sort of algorithm have to be built in as well, so the user can give a number, say 1-5, and Hal knows whether or not to make any changes, or is Hal going to determine that him/herself? As for moving keyword matches up or down, how will this affect Hal? Will it make responses more on topic, or faster?
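
For the 1-5 rating idea, one possible shape is sketched below in Python (illustration only, nothing here is part of HAL). `RatingTracker` is a hypothetical name, and the `min_samples` and `threshold` defaults are arbitrary assumptions: a change is suggested only after enough ratings have come in and their average is low.

```python
from dataclasses import dataclass


@dataclass
class RatingTracker:
    min_samples: int = 5    # assumed: do not act on too few ratings
    threshold: float = 3.0  # assumed: averages below this suggest a change
    count: int = 0
    total: int = 0

    def rate(self, score):
        """Record one user rating on the 1-5 scale."""
        if not 1 <= score <= 5:
            raise ValueError("score must be between 1 and 5")
        self.count += 1
        self.total += score

    @property
    def average(self):
        return self.total / self.count if self.count else None

    def should_change(self):
        """True only when enough ratings exist and their average is low."""
        if self.count == 0 or self.count < self.min_samples:
            return False
        return self.average < self.threshold
```

One tracker per plugin or per keyword rule would let the user's scores, rather than Hal's own guess, decide when a change is worth trying.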