

[Experimental] Compile RiveScript to Binary
 
 

I’m working on an experimental new feature for the Python version of RiveScript: compiling your source files to binary for better performance!

It uses Google’s Protocol Buffers under the hood. When the Python RiveScript module loads your RS source files from disk, instead of loading them straight into memory it builds up a Protocol Buffer and saves it to disk at the end as a .rivec file. These work just like Python’s .pyc files do: the next time your bot loads its RiveScript replies from disk, it reads them directly from the compiled binary .rivec file instead of parsing your code again. If you make a change to a source file, though, RiveScript will re-compile it.
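As a rough illustration of the .pyc-style behaviour described above (not the actual code from the protobuf branch), a staleness check could compare file modification times. The function names `needs_recompile` and `compiled_name`, the `.rive` source extension in the example, and the use of mtimes are all assumptions made for this sketch.

```python
import os


def compiled_name(source_path):
    """Hypothetical naming rule: 'begin.rive' -> 'begin.rivec'."""
    root, _ext = os.path.splitext(source_path)
    return root + ".rivec"


def needs_recompile(source_path, compiled_path):
    """Return True if the compiled file is missing or older than its source.

    Assumption: staleness is judged by modification time, the same idea
    Python has traditionally used for .pyc files.
    """
    if not os.path.exists(compiled_path):
        return True
    return os.path.getmtime(source_path) > os.path.getmtime(compiled_path)
```

A loader built on this would parse and compile only when `needs_recompile(...)` is true, and otherwise read the `.rivec` file directly.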

The benefit is that your replies load from disk much more quickly. The binary format is very close to the internal format RiveScript uses when it loads the replies into memory, so loading the binary version is a very straightforward process.

I did some benchmarks to compare the speed difference. I loaded the A.L.I.C.E. reply set (AliceRS-0.03.tar.gz from http://www.rivescript.com/files/sets/ ).

On the old version of Python RiveScript, it took about 40 seconds to load A.L.I.C.E.’s replies from disk. With the new version, the first load takes longer (~78 seconds) because it’s busy compiling them down to binary, but subsequent loads take only about 25 seconds!
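To put those numbers in perspective, here is a small back-of-the-envelope calculation of when the compile step pays for itself. It uses only the three timings quoted above and assumes the source files don't change between starts (so nothing gets re-compiled); the variable and function names are just for illustration.

```python
import math

OLD_LOAD = 40     # seconds per start with the old version
FIRST_LOAD = 78   # seconds for the first start with the new version (parse + compile)
CACHED_LOAD = 25  # seconds for each later start with the new version


def total_old(starts):
    """Cumulative load time over a number of bot starts, old version."""
    return OLD_LOAD * starts


def total_new(starts):
    """Cumulative load time over a number of bot starts, new version."""
    if starts == 0:
        return 0
    return FIRST_LOAD + CACHED_LOAD * (starts - 1)


extra_on_first_load = FIRST_LOAD - OLD_LOAD    # 38 seconds slower once
saved_per_cached_load = OLD_LOAD - CACHED_LOAD  # 15 seconds faster each time after
cached_loads_to_break_even = math.ceil(extra_on_first_load / saved_per_cached_load)

print(cached_loads_to_break_even)    # 3
print(total_old(4), total_new(4))    # 160 153
```

So under these assumptions the new version comes out ahead from the fourth start onward.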

If you were working on a bot the size of A.L.I.C.E., the increased loading speed would be really helpful. And as you’re actively developing your source files, RiveScript would only need to re-compile the files you changed, instead of having to rebuild ALL the files.

Another nice side benefit is that it made the whole parsing code a bit simpler for me. Since it’s not directly loading your plain text RiveScript documents into memory, there’s less need for juggling around temporary variables (i.e. dealing with the %Previous stuff). Instead, the parsing code just focuses on turning your plain text into a structured binary format that accurately represents the original text. Once that’s all organized, it’s pretty straightforward (and simpler) to get it into memory for the bot to actually use for chatting with people!
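The two-stage idea can be sketched on a tiny subset of RiveScript (only the `+`, `%` and `-` commands). This is a simplified illustration, not the parser from the protobuf branch: plain dicts stand in for the Protocol Buffer messages, and the `parse` and `load` functions and the in-memory layout are made up for the example.

```python
SAMPLE = """
+ knock knock
- Who's there?

+ *
% who is there
- <star> who?
"""


def parse(source):
    """Stage 1: turn source text into a flat list of trigger records."""
    records = []
    current = None
    for raw in source.splitlines():
        line = raw.strip()
        if not line or line.startswith("//"):
            continue  # skip blank lines and comments
        command, _, text = line.partition(" ")
        text = text.strip()
        if command == "+":
            current = {"trigger": text, "previous": None, "replies": []}
            records.append(current)
        elif command in ("%", "-"):
            if current is None:
                raise ValueError("%r appears before any trigger" % line)
            if command == "%":
                current["previous"] = text
            else:
                current["replies"].append(text)
        # All other commands are ignored in this sketch.
    return records


def load(records):
    """Stage 2: arrange the records into a (hypothetical) in-memory layout."""
    memory = {"replies": {}, "previous": {}}
    for rec in records:
        if rec["previous"] is None:
            memory["replies"][rec["trigger"]] = rec["replies"]
        else:
            by_trigger = memory["previous"].setdefault(rec["previous"], {})
            by_trigger[rec["trigger"]] = rec["replies"]
    return memory
```

Because each record already carries its own `previous` field, the second stage needs no temporary state about which trigger a `%` line belongs to; it only sorts finished records into place.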

If you wanna play around with it, it’s on the protobuf branch on the GitHub repo: https://github.com/kirsle/rivescript-python/tree/protobuf

For a zip file download, click the “Download ZIP” button on that page.

Again, this is experimental. I may not end up going with this in the end: the Protocol Buffers stuff adds a penalty during the initial loading phase, and if you prefer NOT to compile binary RiveScript files (by passing build=False to the RiveScript module), you’d have to deal with that slowness every time you start the bot.

But feel free to play around with it and give feedback. :)

Protocol Buffer specification for `*.rivec` files: https://github.com/kirsle/rivescript-python/blob/protobuf/rivescript.proto

 

 