New syntax overhaul for RiveScript?
 
 

I’ve spent much more time writing RiveScript than I have writing in RiveScript, and I’ve been thinking for a while about overhauling what the syntax looks like for writing the code. For example, some of the limitations in RiveScript 2.0’s current syntax are:

     
  • Space is kind of cramped. If you want a conditional that results in random replies, you have to write some ugly code to do it.
  • Conditionals can’t stack. If you want to check multiple variables, one condition needs to redirect to another trigger for the next check. You can be clever and use topics to “hide” the private trigger, but it’s messy.

Compare this syntax:

+ my name is *
* <get name> == <formal> => I know, you've\s
   ^ told me your name before.
* <get name> == <bot name> => <set name=<formal>>{random}
   ^ Wow, we have the same name!|
   ^ What a coincidence, that's my name too!|
   ^ What are the odds that we'd have the same name?
   ^ {/random}
- <set name=<formal>>Nice to meet you, <get name>!

With what it could look like:

trig my name is *
    if (<get name> == <formal>) {
        reply I know, you've told me your name before.
    }
    elsif (<get name> == <bot name>) {
        set name = <formal>
        random {
            Wow, we have the same name!
            What a coincidence, that's my name too!
            What are the odds that we'd have the same name?
        }
    }
    else {
        set name = <formal>
        reply Nice to meet you, <get name>!
    }

I’m open to thoughts/suggestions about this. The new syntax wouldn’t be backwards compatible (it’d be RS 3.0), but since RS is so trivial to parse, an upgrader program for old scripts would be easy to make. Here’s my full mock-up of the new syntax idea:

// RiveScript 3.0 syntax mockup

version 3.0

// Same syntax as RS 2.0
! array colors red green blue yellow
sub who's = who is
//'

// The new version of the > begin block
begin {
  // New version of conditionals
  if (<get name> == undefined) {
    // New way of setting vars, the <set> tag is removed
    set topic newuser
  }

  if (<bot mood> == normal) {
    // The reply commands stack, more than one reply = they concatenate.
    reply <ok>
  }
  elsif (<bot mood> == angry) {
    // No more difference between {tags} and <tags>, we use <tags> everywhere
    reply <uppercase><ok></uppercase>
  }
  elsif (<bot mood> == sad) {
    reply <lowercase><ok></lowercase>
  }
}

topic newuser {
  // Maybe + will be an alias for trig? And - an alias for reply?
  // Triggers make "implied" blocks (no need for { and }). The block ends
  // when a new trigger is found or a parent block ends (a } is found).
  trig *
    set topic newuser_1
    reply Hello, I don't know you! What's your name?
}

topic newuser_1 {
  // Wildcards are the same: _ for words, # for numbers, * for anything
  trig my name is _
    set name = <formal>
    set topic newuser_2
    reply Nice to meet you, how old are you?

  // 'alias' is the new @.
  trig _
    alias my name is <star>

  trig *
    reply I'm sorry, what was your name?
//'
}

topic newuser_2 {
  trig i am # years old
    set age = <star>
    set topic random
    reply Ok, I will remember you are <get age> years old.

  trig #
    alias i am <star> years old

  trig *
    reply I'm sorry, how old are you?
}

// The indentations are optional but recommended for style
trig hello bot
reply Hello human!

// aliases can stack with replies.
trig * or something
  reply Or something.
  alias <star>

// Random replies. To continue a line, end it with a \ character.
// The random keyword stacks with reply too.
trig how are you
  random {
    I'm great, how are you?
    Good, you?
    I'm doing well, how about you?
    I am doing so well you \
      wouldn't even believe it.
//'
  }

// A reply stacking example. Replies are concatenated using space characters
// now (this may be a configurable option which may be dynamically changed)
trig tell me about linux
  // Begin a reply.
  reply Linux is a free operating system.

  // Set some variables.
  set topic linux

  // Continue.
  reply It was written by Linus Torvalds and first released in 1991. \
    One example of a distribution of Linux is

  // A bit of random
  random {
    Fedora.
    Ubuntu.
    Gentoo.
    Mandriva.
  }

  // Throw in a conditional.
  if (<get name> == Linus) {
    reply You have the same name as the creator of Linux!
  }

// You can redefine global variables just as easily now too.
trig turn debugging (on|off)
  // Only the botmaster!
  if (<id> == <bot master>) {
    if (<star> == on) {
      // Variables are lexically scoped to the current request.
      var debug true
    }
    else {
      var debug false
    }

    // Set the debug mode.
    ! global debug = <var debug>

    reply Debugging has been turned <star>.
  }
  else {
    reply You are not the botmaster, I won't listen to you.
    //'
  }

// Object example.
object md5_encode perl {
  my ($rs, @args) = @_;

  use Digest::MD5 qw(md5_hex);
  my $hash = md5_hex(join(" ", @args));

  return $hash;
}

// Call the object example.
trig what is the md5 hash of *
  // If the character(s) used to concatenate reply segments is configurable,
  // it could be overridden on a per-reply basis. Here we could set it to
  // be a blank string. BTW, quotes are optional on all "set"-like commands,
  // but they're handy for setting empty strings or spaces as values.
  option concat ""

  reply The MD5 hash of "<star>" is "
  call md5_encode <star>
  reply "
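
As a rough illustration of the reply stacking and the `concat` option in the mock-up, here is a minimal Python sketch. It assumes each reply/random/call command yields one string segment and that an interpreter joins them at the end; `join_segments` is a hypothetical helper, not part of any RiveScript implementation, and `hashlib` only stands in for the `md5_encode` object.

```python
import hashlib


def join_segments(segments, concat=" "):
    """Join the reply segments of one trigger with the configured string."""
    return concat.join(segments)


# Default from the mock-up: stacked replies are joined with a space.
linux = join_segments([
    "Linux is a free operating system.",
    "It was written by Linus Torvalds and first released in 1991.",
])

# Equivalent of `option concat ""`: reply, call, reply with nothing in between.
star = "hello"  # assumed value of <star> for this example
md5 = join_segments(
    [
        'The MD5 hash of "' + star + '" is "',
        hashlib.md5(star.encode("utf-8")).hexdigest(),  # stands in for `call md5_encode <star>`
        '"',
    ],
    concat="",
)

print(linux)
print(md5)
```

Keeping the separator as a plain parameter means a per-reply override is just a different argument for that one trigger.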
 

 
  [ # 1 ]

Noah,
your syntax is starting to look a lot like JavaScript. You might think about adopting some of its conventions.

 

 
  [ # 2 ]

What I’m going for is something that would be easy to parse (I may consider a Python-style pattern where indents are important and get rid of those curly braces altogether); one thing I wanna avoid is adding dependencies on language parsers and things, so something that’s simple to parse as plain text is ideal.

What JavaScript conventions do you mean?

The mockup for this new syntax is sort of an evolution of the 2.0 syntax… for example, triggers automatically terminate at the next trigger or at the end of a “block” (like a begin or topic section). Instead of being declared from “> begin” to “< begin”, blocks use curly braces, and the only other JavaScript-like syntax is in the conditional checks.

 

 
  [ # 3 ]

begin = new function() {
  // New version of conditionals
  if (name == undefined) {
    // New way of setting vars, the <set> tag is removed
    topic = newuser;
  }

  if (bot_mood == "normal") {
    // The reply commands stack, more than one reply = they concatenate.
    return "<ok>"
  }
  else if (bot_mood == "angry") {
    // No more difference between {tags} and <tags>, we use <tags> everywhere
    return "<uppercase><ok></uppercase>"
  }
  else if (bot_mood == "sad") {
    return "<lowercase><ok></lowercase>"
  }
}

random(
Fedora.
Ubuntu.
Gentoo.
Mandriva.
)

If you pick a Python or JavaScript style, people who program in that language might find it easier.

CoffeeScript might also give you some ideas:
http://coffeescript.org/

 

 
  [ # 4 ]

Neat. I think I may be leaning more towards a Python style syntax. The main thing I’m a bit unsure about in my initial mockup is the inconsistency between when you need {} and when you don’t (i.e. never on triggers).

Another fun idea I had was to make this sort of thing possible:

#!/usr/bin/rivescript

! include /var/bot/brain/*.rs

trig hello bot
reply Hello human! 

So you could make your scripts “executable” in the manner of a shell script. Instead of running “$ rivescript /var/bot/brain” you just do “$ /var/bot/brain/bot.rs”

 

 
  [ # 5 ]

I thought the same thing. It does look a whole lot like JavaScript. Since JS is pretty much native on most devices these days, I don’t see the point in re-inventing the wheel. There are even compilers now to translate to other programming languages from JavaScript.

JS is not just for form validation any longer. :)

 

 
  [ # 6 ]

Is this still your intention, Noah?

As I’m just getting used to RS, I’m almost *not* hoping for this, but that’s just my selfishness. :)

My preference would be the Python syntax of the options you mentioned:

trig tell me about linux:
  // Begin a reply.
  reply:
    Linux is a free operating system.

    // Set some variables.
    set: topic = linux

    It was written by Linus Torvalds and first released in 1991.
    One example of a distribution of Linux is

    // A bit of random
    random:
      Fedora.
      Ubuntu.
      Gentoo.
      Mandriva.

    // Throw in a conditional.
    if <get name> == Linus:
      You have the same name as the creator of Linux!

 

 
  [ # 7 ]

Yeah, I probably won’t get around to redoing RiveScript anytime in the near future. ;) But if I do, there’ll definitely be a 100% accurate converter program to just upgrade old code to the new format.
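
For what it's worth, the simplest part of such a converter can be sketched in a few lines of Python. This is only an illustration under assumptions: the `upgrade` function is hypothetical, the target keywords (`trig`, `reply`, `alias`, curly-brace blocks) come from the mock-up earlier in the thread rather than any released syntax, and conditions (`*`), continuations (`^`) and inline tags such as <set> are deliberately left out.

```python
def upgrade(source: str) -> str:
    """Translate a few RS 2.0 commands into the mocked-up 3.0 keywords."""
    out = []
    for raw in source.splitlines():
        line = raw.strip()
        if not line or line.startswith("//"):
            out.append(line)             # blank lines and comments pass through
            continue
        cmd, _, rest = line.partition(" ")
        rest = rest.strip()
        if cmd == "+":
            out.append(f"trig {rest}")
        elif cmd == "-":
            out.append(f"  reply {rest}")
        elif cmd == "@":
            out.append(f"  alias {rest}")
        elif cmd == ">":
            out.append(f"{rest} {{")     # "> topic x" becomes "topic x {"
        elif cmd == "<":
            out.append("}")              # "< topic" / "< begin" close the block
        else:
            # Anything else needs a real parser; fail loudly instead of guessing.
            raise ValueError(f"unsupported RS 2.0 line: {line!r}")
    return "\n".join(out)


old = "\n".join([
    "> topic newuser",
    "+ hello bot",
    "- Hello human!",
    "@ hi",
    "< topic",
])
print(upgrade(old))
```

A converter that is actually 100% accurate would need to handle every command and tag, so treat this as a starting point only.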

 

 
  [ # 8 ]

Cheers Noah!

May I say, I’ve found your Python code easier to pick up than Python AIML. Good job!

 

 
  [ # 9 ]

Noah,

You may have to rename RIVEscript to RINSEscript ...
Rendering Intelligence Very Easily ... R-I-V-E
Rendering Intelligence Not So Easily ... R-I-N-S-E

 

 

 


Hehehe! That was only a joke!  Don’t take offense.

 

 
  [ # 10 ]

Oops, ruined my own joke!  Edit above: “You many have” should be, “You may have”

All joking aside…

_______________________________________________________________________

Here is some RiveScript news

concerning syntax:


Library RiveScript Perl 1.22-1

is an Ubuntu 12.10 package

in Synaptic Package Manager.

 

 
  [ # 11 ]

I’ve been thinking some more about this lately (went to a talk about how to write your own interpreter at PyCon, and I may take a stab at doing this “properly” with a lexer/parser... or not, who knows).

I’m thinking Python-style (but not Python directly, because I don’t want to tie RiveScript down to one programming language) is a good idea. It would also be mainly command-driven like Tcl with a few exceptions for multi-line commands. Some syntax ideas:

on hello bot:
   say Hello, human!

on my name is *:
   set name = <star>
   say Nice to meet you, <get name>!

on what is my name:
   if defined name:
      say Your name is <get name>, seeker!
   else:
      say I don't know what your name is.

on say something random:
   say This
   random [
     "sentence",
     "message",
   ]
   say has a random
   random [
     "couple of words",
     "few things",
   ]
   say in it

I’m still playing around with some syntax ideas. The idea of quoting the parameters sounds cool, but it looks a bit ugly when you have a character like * at the end of a quoted string, like “my name is *”... and for readability, I may make colons optional after the commands, and optional at the ends of some lines, so:

on: my name is *
   say: Nice to meet you
Also, I like the idea of having run-time parser configuration settings (on a per-file basis). If you’re familiar with Perl, what I’m talking about is a bit like its pragmas (like strict and warnings): when Perl code does “use strict” or “use warnings”, stricture and warnings are applied for all of the following code (within the same lexical scope, but this is usually done at a file level), and you can then do “no strict”/“no warnings” to suppress the pragmas for a bit of code, then turn them back on, etc.

I also like the idea of making RiveScript a proper “scripting language”... which on Unix systems means your “begin.rs” might begin with the line “#!/usr/bin/rivescript”, and there would be commands to include other RiveScript files. There’d also be a bit of “programming language” type features, like being able to print a message to standard output directly. So your begin.rs could print a few things to the terminal window (maybe some copyright lines about the bot being run?). I won’t go too overboard with all that though. ;) I do have to keep in mind that RiveScript will sometimes be used to run untrusted code (for example my chatbot hosting service), so the I/O capabilities of RiveScript itself should be limited and possible to disable globally.

Lots of crazy ideas here, what do ya guys think? If I’m going to redo RiveScript, I want to make it significantly better than it currently is, and not to follow the overall concepts and limitations of an AIML 1.0 style bot.

 

 
  [ # 12 ]

Noah,
I have also been thinking about additional features for my scripting language (JAIL).

Let me give you some feedback based on developing and using JAIL:
If you are going to adopt Python-like indentation, you may not need the “on” and “say”.

my name is *
    Nice to meet you

If you are going to have “on” and “say”, then you don’t need the indents except for “tree” type responses.

on: my name is *
say: Nice to meet you

You might consider symbols instead of “on” and “say”.

Your random structure is still too verbose. One of the issues for those of us building NLG systems is how to make the language easy and expressive enough to let non-programmers use it. I’ll make the same suggestion I gave to Dr. Wallace as he was building AIML 2.0, you should adopt JAIL’s random tags “[”, “|”, “]” (ok they may not technically be ‘tags’ but you get the picture).

Your example in JAIL:

/say something random/i,
"This [sentence|message] has a random [couple of words|few things] in it."

This makes it easy for anyone to see the intent of the response. It takes your 8-10 lines and compresses them down to 2, saving space along with the readability benefits.
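
To show how little machinery the compressed style needs, here is a minimal Python sketch of expanding “[a|b]” groups in a response. It is not JAIL's actual implementation (JAIL's internals are not shown in this thread); `expand` and its `rng` parameter are hypothetical names, and nested brackets or escaped literal brackets are not handled.

```python
import random
import re

# One "[...]" group with no nested brackets inside it.
_CHOICE = re.compile(r"\[([^\[\]]*)\]")


def expand(template, rng=None):
    """Replace every [a|b|c] group with one randomly chosen alternative."""
    rng = rng if rng is not None else random
    return _CHOICE.sub(lambda m: rng.choice(m.group(1).split("|")), template)


print(expand("This [sentence|message] has a random [couple of words|few things] in it."))
```

Each group is resolved independently, so the example line can produce four different sentences.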

 

 

 
  [ # 13 ]

Between friends…

Said politely and in a respectful manner: I amicably disagree

with a chatbot that does not keep data separate from code

because that is fundamental and has been since Elizabot.

 

 
  [ # 14 ]

On the Python-style indenting, what I was thinking was that the trigger line (“on”) is somewhat like the function definition line in Python, and that everything that follows the trigger are the details from it. This is a bit like how RiveScript works today: after a + line is spotted, every line that follows is assumed to be related to the trigger, until a new trigger is found OR something like a closing label is found (i.e. a “< topic” line).
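
A minimal Python sketch of that “implied block” rule, under some assumptions: triggers start with the proposed `on` keyword, a line starting with “< ” (or a lone “}”) counts as the closing label, and indentation is ignored. The function name `group_triggers` is made up for illustration.

```python
def group_triggers(source):
    """Map each trigger to the list of command lines that follow it."""
    triggers = {}
    current = None  # command list of the currently open trigger, if any
    for raw in source.splitlines():
        line = raw.strip()
        if not line or line.startswith("//"):
            continue                      # skip blank lines and comments
        if line.startswith("on "):
            # A new trigger implicitly closes the previous one.
            pattern = line[3:].rstrip(":").strip()
            current = triggers.setdefault(pattern, [])
        elif line == "}" or line.startswith("< "):
            current = None                # a closing label also ends the block
        elif current is not None:
            current.append(line)          # everything else belongs to the trigger
    return triggers


script = """
on hello bot:
   say Hello, human!

on my name is *:
   set name = <star>
   say Nice to meet you, <get name>!
"""
print(group_triggers(script))
```

Because the block boundary is decided by the next `on` line rather than by indentation, this stays a plain line-by-line scan with no lexer required.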

I liked the word “on” because it’s less verbose than “trig” or “trigger” or “pattern”, and makes the trigger seem more like an event handler, like “onClick” or “onKeyDown”. A while back, when I first started thinking about this, I figured having +/- as aliases for on/say would be nice to make it look backwards compatible, but then it would make the code look inconsistent: some people would prefer the +/-, and some would prefer on/say. So I’d rather go with just one or the other, and on/say reads more like plain English, so you can just read the source code and have a pretty good idea what it does.

I do like your random syntax though and I may borrow that. But I think I’ll make it be (...|...) instead, with parentheses, because this would make it more consistent with the alternations in triggers, like “on what is your (home|cell) phone [number]”... in triggers, square brackets mean the entire set of words is optional, whereas parentheses mean one of the options must be used. Since this would be “magic syntax” I’d have to make it escapable somehow, maybe by putting a \ before the parenthesis in case the botmaster actually did intend to have literal parentheses there. Or this may be a good use for those lexical pragmas I was talking about, where you can have a line in your reply section saying “no tags” and the following say commands will be literal strings until you set “use tags” again, or until the lexical scope of the reply runs out (i.e. at the next trigger).
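
As a sketch of the trigger side of that rule (parentheses required, square brackets optional), here is one way it could be compiled to a regular expression in Python. This is an assumption-laden illustration, not how any RiveScript interpreter does it: `trigger_to_regex` is a hypothetical name, groups must not contain spaces, and wildcards and the \ escape are not handled.

```python
import re


def trigger_to_regex(trigger):
    """Compile a trigger with (a|b) alternations and [optional] words."""
    pieces = []
    for token in trigger.split():
        inner = "|".join(re.escape(alt) for alt in token[1:-1].split("|"))
        if token.startswith("(") and token.endswith(")"):
            # Exactly one of the alternatives must be present (captured).
            pieces.append(rf"(?:^|\s+)({inner})")
        elif token.startswith("[") and token.endswith("]"):
            # The whole group, including its leading space, may be absent.
            pieces.append(rf"(?:(?:^|\s+)(?:{inner}))?")
        else:
            pieces.append(rf"(?:^|\s+){re.escape(token)}")
    return re.compile("".join(pieces) + r"\s*$")


rx = trigger_to_regex("what is your (home|cell) phone [number]")
for message in ("what is your home phone number",
                "what is your cell phone",
                "what is your work phone"):
    print(message, "->", bool(rx.match(message)))
```

The optional group swallows its own leading whitespace, which is what lets “what is your cell phone” match without a trailing space.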

I’m not sure who or what 8PLA is disagreeing with.

 

 
  [ # 15 ]

I chose the “[]” because I felt it would come up less in responses than “()” so I wouldn’t have to worry about escaping so much. I have found this compressed random style makes it much easier to add variations in your responses.

 
