

Free Stanford Courses on Machine Learning and NLP
 
 
  [ # 16 ]

The AI class for those who missed it: videos and some subtitles in an easier-to-access format. Just select the topics you are interested in.
http://www.wonderwhy-er.com/ai-class/

 

 
  [ # 17 ]

Dave, sorry to hear that.  I only had time to take one course, the machine learning one.  It was very interesting to blow the dust off my knowledge of vector mathematics and the like.

Merlin - holy cow… I’m jealous of how much time you have!  Between my day job, my own project and life in general, I could barely get all those assignments completed… I fell behind on one or two and got ‘dinged’ 10% on the marks even though the answers were all correct (late penalty).

Next month I’m taking the computer security course.  I’m very interested in source code review and ‘pen testing’.  If I had the time I would take many more of them; the probability one and the SaaS one are tempting me!  Learning that these free online courses exist has been great!

 

 
  [ # 18 ]

I completed the machine learning course and started the AI course, but dropped the latter due to lack of time.

The machine learning course was IMHO superb and is highly recommended to anyone interested in this topic. Nothing that could directly be used in a chatbot though (unless you want your bot to perform spam (e.g. insult) detection :) ).

Some of the techniques discussed during the course, like support vector machines, appear to have applications in natural language processing, but we didn’t discuss those applications. I expect (or rather, hope) the NLP course (my next endeavour) will touch upon chatbots (conversational agents), given the hype that surrounds Siri at the moment.

 

 
  [ # 19 ]

I took a quick look at the NLP course; it doesn’t look too promising.  It seems like the same song and dance: try to parse this input and create an SQL query, or convert it to first-order logic.  Perhaps useful in some narrow way, but I doubt that will bring us full, free-form, true NLU.  Correct me if I’m wrong, and perhaps I looked too quickly, but it didn’t seem focused on handling the extreme linguistic ambiguities that we have to deal with in the real world.

 

 
  [ # 20 ]

I tend to agree, Victor. So far none of the courses deal with the full breadth of things we need for a chatbot. But judging from the topics in his past courses, some of them may prove interesting.

http://see.stanford.edu/see/lecturelist.aspx?coll=63480b48-8819-4efd-8412-263f1a472f5a

 

 
  [ # 21 ]

Oh, no doubt there is something to be gained.  I guess I’m trying to prioritize… because I certainly don’t have time for all of them!  For the NLP course, I think I’ll just watch the videos and skip the exercises.

 

 
  [ # 22 ]

I liked using programs to solve as many of the quizzes, homework and exam questions as I could in the AI class. The AIMA code repository was useful, but I had to debug some of the programs there. I also used NLTK (http://www.nltk.org) to do some of the quizzes in the two NLP units. (But I’ve been running into bugs in NLTK too, in its NgramModel class for example.)
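
For flavour, here is the kind of n-gram counting that covers several of those quiz questions. This is a minimal sketch of my own that sticks to NLTK’s stable FreqDist APIs rather than the buggy NgramModel class; the Brown corpus is an arbitrary choice and may need a one-time nltk.download('brown').

----

# Minimal sketch: bigram counts and MLE conditional probabilities in NLTK,
# using the stable FreqDist APIs instead of the NgramModel class.
import nltk
from nltk.corpus import brown   # any tokenized corpus will do

words = [w.lower() for w in brown.words()[:50000]]

# Count how often each word follows each other word.
cfd = nltk.ConditionalFreqDist(nltk.bigrams(words))

def bigram_prob(w1, w2):
    """Maximum-likelihood estimate of P(w2 | w1)."""
    total = cfd[w1].N()
    return cfd[w1][w2] / float(total) if total else 0.0

print(bigram_prob('of', 'the'))       # a frequent bigram
print(cfd['united'].most_common(3))   # likeliest words after 'united'

----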

I’ve been thinking about how I could use probabilistic techniques to segment mindpixels such as:

Is it hot during the summer?
Is ice cream cold?
Was George Washington a president of the United States?
Is 3 times 9 equal to 27?

The Stanford Parser (which uses a Probabilistic Context-Free Grammar) actually does pretty well segmenting the first three of these examples. I would just have to write a scraper that looks for constituents on two separate lines that have the same level of indentation in the parse tree.

Note: The online parser demo (linked above) must be using the “englishFactored.ser.gz” lexicalized PCFG, which is included in the download but requires more memory allocated to the Java heap at startup than the “englishPCFG.ser.gz” grammar does…
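
If it helps anyone reproduce this, the parser can also be driven from Python via subprocess. A rough sketch; the jar name, model filenames and heap sizes are assumptions based on the 2011-era download, so adjust them to your local copy:

----

# Rough sketch: drive the Stanford lexparser from Python via subprocess.
import subprocess

def stanford_parse(text, model='englishPCFG.ser.gz', heap='-mx500m'):
    """Return the Penn-treebank parse of `text` as a string. The factored
    grammar ('englishFactored.ser.gz') needs a bigger heap, e.g. '-mx1800m'."""
    with open('input.txt', 'w') as f:
        f.write(text + '\n')
    cmd = ['java', heap, '-cp', 'stanford-parser.jar',
           'edu.stanford.nlp.parser.lexparser.LexicalizedParser',
           '-outputFormat', 'penn', model, 'input.txt']
    return subprocess.check_output(cmd).decode('utf-8')

print(stanford_parse('Is ice cream cold?'))

----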

 

 
  [ # 23 ]

Thanks for the updates! Really useful!

About this bit:

Robert Mitchell - Jan 6, 2012:

Is it hot during the summer?
Is ice cream cold?
Was George Washington a president of the United States?
Is 3 times 9 equal to 27?

 

The first two questions are subjective; it depends on how one experiences hot and cold, and on the context in which the question was raised.

The George Washington and 3 times 9 questions are more factual lookups, aren’t they?

Since you present those examples as more or less the same sort of phrase, what’s the similarity between the four of them?

 

 
  [ # 24 ]
Erwin Van Lun - Jan 6, 2012:

Is it hot during the summer?
Is ice cream cold?
Was George Washington a president of the United States?
Is 3 times 9 equal to 27?

The first two questions are subjective; it depends on how one experiences hot and cold, and on the context in which the question was raised.

I think the point was more about how those input strings get parsed to deduce what the user is asking, and not so much about the “correct” answer.

If the input question is subjective, the answer can be subjective… as they say, “garbage in, garbage out”, or ‘ask a stupid question, get a stupid answer’ :)

Actually, the very fact that the user asked such a general, broad and non-detailed question implies they would probably be happy with a very general answer.  If not, the bot should “educate” the user (that is, give them a list of “parameters” they need to specify for a more useful answer).  Perhaps the bot could have two modes, “casual conversation” and “scientific mode”!

More important than the “correct” answer is understanding what the input question really means.

However, I do agree that the system should have the functionality to deduce *that* the input was subjective.  But again, what is the VITAL thing here? UNDERSTANDING, yes.  The system must determine, first of all, that the input is a question, and secondly that it is a subjective question.  It can then ask for specifics (‘Hot in the summer? Well, it depends; how many degrees C or F do you consider “hot”?’), or be funny (‘Actually, compared to the core temperature of the sun, it is extremely cold in the summer’), or follow ‘general subjective input, general subjective output’: ‘Is it hot in the summer?’, ‘Yes, a lot hotter than the winter’.  These answers aren’t wrong; if they are, the input question is ‘wrong’.

The system should also understand *dependencies*.  Hot in the summer… well, even here in Canada (believe it or not), it -is- ‘hot’, but further south it is even hotter.  Thus the system, knowing that “location” is a dependency of answering ‘Is it hot in the summer?’, should ask a clarifying question.

input: [is it hot in summer]

knowledge base: deduce that “location” is a dependency.

bot: “Depends, where are you?”
user: “Canada”
bot: “The average summer temperature in Canada varies between 25 and 30 C.”
(or, if the user had entered Mexico: “Average between 35 and 40 C.”)
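
To make the idea concrete, here is a toy sketch of mine of dependency-driven clarification; the knowledge base, patterns and canned answers are all made up for illustration:

----

# Toy sketch: check a question's dependencies, ask a clarifying question
# for any that are unresolved, then answer. All data here is invented.
DEPENDENCIES = {
    'is it hot in summer': ['location'],
}

ANSWERS = {
    ('is it hot in summer', 'canada'):
        'The average summer temperature in Canada varies between 25 and 30 C.',
    ('is it hot in summer', 'mexico'):
        'Average between 35 and 40 C.',
}

def respond(question, context):
    q = question.strip('?! ').lower()
    # Ask a clarifying question for every unresolved dependency.
    for dep in DEPENDENCIES.get(q, []):
        if dep not in context:
            return 'Depends, where are you?' if dep == 'location' \
                   else 'Can you tell me your %s?' % dep
    return ANSWERS.get((q, context.get('location', '').lower()),
                       "I don't know.")

ctx = {}
print(respond('Is it hot in summer?', ctx))   # -> Depends, where are you?
ctx['location'] = 'Canada'
print(respond('Is it hot in summer?', ctx))   # -> the 25-30 C answer

----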

This ability to determine what the dependencies are, and to ask clarifying questions, will be the next generation of search engines.  Right now you enter your string into Google and cross your fingers; there is no such functionality, except of course the cool misspelling feature (“Did you mean _____”), which I think is great but is a ‘far cry’ from full-blown NLU.

 

 
  [ # 25 ]

The problem I’m trying to address is how to split Mindpixel sentences such as:

Is ice cream cold?
Do airplanes fly?
was abraham lincoln once president of the united states?
do people have emotions?
Does the human species have a male and a female gender?
Do most cars have doors?
Was the Great Wall of China built by humans?

When we read these sentences, we know that “ice cream” forms a constituent, and “cold” another constituent; “abraham lincoln” goes together, “the human species”, “most cars”, etc.

It’s hard to capture our ability to split these sentences in a program. Regexes are one way, but they don’t work well for the examples above.
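
To see why, here is a naive regex split (a pattern of my own, purely illustrative): it grabs a single word after the auxiliary, so a multi-word subject like “ice cream” gets torn apart:

----

# A naive regex split: auxiliary verb, one word as subject, rest as predicate.
# It has no notion of constituents, so multi-word subjects come out wrong.
import re

pattern = re.compile(r'^(?:is|was|do|does)\s+(\w+)\s+(.*?)\??$', re.IGNORECASE)

print(pattern.match('Is ice cream cold?').groups())
# -> ('ice', 'cream cold')   'ice cream' is torn apart

print(pattern.match('Do airplanes fly?').groups())
# -> ('airplanes', 'fly')    right, but only by luck

----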

I wanted to see if the Stanford Lexparser could reliably find groupings in the Mindpixel sentences. So I made lexagent:

----

> split: is ice cream cold?
subject = ice cream; predicate = cold.

> split: Do airplanes fly?
subject = airplanes; predicate = fly.

> split: was abraham lincoln once president of the united states?
subject = abraham lincoln; predicate = once president of the united states.

> split: do people have emotions?
subject = people; predicate = have emotions.

> split: does the human species have a male and a female gender?
subject = the human species; predicate = have a male and a female gender.

> split: Do most cars have doors?
subject = most cars; predicate = have doors.

> split: Was the Great Wall of China built by humans?
subject = the Great Wall; predicate = of China built by humans.

> split: Was the Great Wall built by humans?
subject = the Great Wall; predicate = built by humans.

----

So it works for all but the ‘Great Wall of China’ example. The problem there is that the Lexparser’s parse tree is more complex than for the other examples:

----

> parse: Was the Great Wall of China built by humans?
(ROOT
  (FRAG (: -)
    (VP (VBD Was)
      (NP
        (NP (DT the) (NNP Great) (NNP Wall))
        (PP (IN of)
          (NP (NNP China))))
      (PP (VBN built)
        (PP (IN by)
          (NP (NNS humans)))))
    (. ?)))

----

My algorithm for finding constituents looks for the same level of indentation on two successive lines, but in this case “the Great Wall of China” covers four lines in the parse tree. So my algorithm needs to be amended…
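
One way to amend it, sketched below with NLTK’s Tree class (my own code, not part of lexagent), is to walk the parse tree directly instead of scanning indentation, so a subject NP that spans several pretty-printed lines is still a single node. This handles the FRAG-shaped trees above; other clause types (SQ, SBARQ) would need their own cases:

----

# Sketch: split subject/predicate by walking the parse tree rather than
# scanning pretty-printed indentation.
from nltk import Tree

def split_question(parse_str):
    """Subject = the first NP child of the VP (after the auxiliary verb);
    predicate = the leaves of everything following that NP."""
    tree = Tree.fromstring(parse_str)
    for vp in tree.subtrees(lambda t: t.label() == 'VP'):
        kids = list(vp)
        np = next((k for k in kids
                   if isinstance(k, Tree) and k.label() == 'NP'), None)
        if np is None:
            continue
        idx = kids.index(np)
        subject = ' '.join(np.leaves())
        predicate = ' '.join(w for k in kids[idx + 1:] for w in k.leaves())
        return subject, predicate
    return None

parse = """(ROOT (FRAG (: -)
  (VP (VBD Was)
    (NP (NP (DT the) (NNP Great) (NNP Wall)) (PP (IN of) (NP (NNP China))))
    (PP (VBN built) (PP (IN by) (NP (NNS humans)))))
  (. ?)))"""
print(split_question(parse))
# -> ('the Great Wall of China', 'built by humans')

----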

The lexparser fails to parse the correct constituents in other cases, such as:

----

> split: Is Bugs Bunny a cartoon character?
subject = Bugs; predicate = Bunny a cartoon character.

> parse: Is Bugs Bunny a cartoon character?
(ROOT
  (FRAG (: -)
    (VP (VBZ Is)
      (NP
        (NP (NNS Bugs))
        (PP (NNP Bunny)
          (NP (DT a) (NN cartoon) (NN character)))))
    (. ?)))

----

Note: I’m using the lexicalized PCFG grammar. In this last case, the regular PCFG grammar correctly groups “Bugs” and “Bunny” together.

In conclusion, I think the probabilistic methods taught in the AI class are another tool in the agent toolbox. Where they fail, another agent should be used.
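
In that spirit, the dispatch itself can be trivial. A toy sketch; the agent functions below are placeholders, not real implementations:

----

# Toy fallback dispatch: try each splitting agent until one succeeds.
def dispatch(sentence, agents):
    for agent in agents:
        result = agent(sentence)
        if result is not None:      # None signals "I can't handle this"
            return result
    return None

def lexicalized_pcfg_agent(s):
    return None                     # placeholder: fails on this input

def plain_pcfg_agent(s):
    return ('Bugs Bunny', 'a cartoon character')   # placeholder success

print(dispatch('Is Bugs Bunny a cartoon character?',
               [lexicalized_pcfg_agent, plain_pcfg_agent]))

----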

 

 
  [ # 26 ]

@Steve, Victor, Dave: sorry for the inconvenience, I just deleted Rhea’s account.

@Victor: The End of the Captcha is near:
http://www.chatbots.org/research/news/the_unintended_captcha_hack_algorithm_of_google/

 

 
  [ # 27 ]
Erwin Van Lun - Jan 24, 2012:

@Victor: The End of the Captcha is near:

Yes… I read that posting a while back… and completely AGREE!  AI can help spammers… the never-ending “arms race”!

 

 
  [ # 28 ]

This may cause Stanford to rethink their free courses concept.
Sebastian Thrun Resigns from Stanford to Launch Udacity
http://www.i-programmer.info/news/150-training-a-education/3658-sebastian-thrun-resigns-from-stanford-to-launch-udacity.html

Thrun was one of the teachers of the Introduction to AI course.
Enrollment in the traditional classroom course was down by 85% versus other semesters, with students opting for the online course instead.

This could mark the beginning of a sea change for university education.

 

 
  [ # 29 ]

I’d have to agree with that, though not completely.
In Belgium, one university (the biggest) tried to force students into taking an online version of a course because there were too many of them to fit into the lecture halls (something like 600+ students per class). This resulted in a general student uprising, which forced the school to re-examine the project. So eventually they set up a video conferencing system to spread the students over many rooms.
Now the interesting thing about this: the rise in student numbers was mainly caused by school costs going up in the Netherlands, so students came to Belgian universities (where university is almost free). These were foreigners; you’d think they would have liked the idea of doing this from home.

I had seen the ‘Udacity’ thing yesterday or so as well, and had been wondering why. What’s the compelling reason for this person to move from a university to a startup company? I doubt it is for the teaching.

 

 
  [ # 30 ]

I do believe Sebastian was moved by the overwhelming response to the course. The professors held Q&A sessions where you could get a sense of how they felt about the course. They also participated in a round-table discussion about the future of online education with Salman Khan.

The Khan Academy is a not-for-profit educational organization created in 2006 by Bangladeshi-American educator Salman Khan, a graduate of MIT. With the stated mission of “providing a high quality education to anyone, anywhere”, the website supplies a free online collection of more than 2,900 micro-lectures via video tutorials stored on YouTube, teaching Mathematics, History, Healthcare & Medicine, Finance, Physics, Chemistry, Biology, Astronomy, Economics, Cosmology and Computer Science.

http://www.ted.com/talks/salman_khan_let_s_use_video_to_reinvent_education.html

Sebastian is a smart guy and this was probably the best time for him to start a new company. He got a lot of press for his Stanford class and for the work his team did on the robotic car. It was time to ‘strike while the iron is hot.’

On a side note, I had signed up for the Stanford Natural Language Processing course, and for some undisclosed reason the course was put on hold:

We hope you are as excited as we are for the forthcoming launch of Natural Language Processing! Unfortunately, we’ve just learned that the launch of the NLP class (and all our other online courses here) has been delayed for reasons beyond our control. We aren’t sure how long the delay will be - the range is from a few days to a week or possibly several weeks. We’ll let you know a firm date as soon as we possibly can.


I wonder if the two events are related.

 

 
