
Free Stanford Courses on Machine Learning and NLP
 
 
  [ # 31 ]

I’ve just been watching http://www.cosmolearning.com/video-lectures/linear-regression-gradient-descent-normal-equations/ which is a video of Andrew Ng teaching the second lecture of the Machine Learning class in the Stanford classroom, and comparing it to the online videos from the ml-class.org site.

I think Ng is much more at home in the online class videos: in the Stanford in-person class video, he has to deal with moving chalkboards, his writing slopes down, the transition to viewing his laptop screen is not as smooth, he has to deal with chalkdust and erasers, he says “Um” and “ok?” a lot, and generally seems less in control of his presentation than in the online videos.

Just viewing the classroom video brings back all the anxiety and “white noise” I used to experience in class, which slowed down my learning potential :) I much prefer the online class videos.

 

 
  [ # 32 ]

Thanks for the links. Here are some others:

http://academicearth.org/subjects/computer-science/category:14

http://academicearth.org/subjects/

Regards,

Robert

 

 
  [ # 33 ]

Today ends the Stanford NLP course. It was another insightful course (although this one had a more ‘Beta’ feel than the others). You can watch the class videos here: https://class.coursera.org/nlp/class/index

John L. Hennessy, Stanford University’s president, predicts the death of the lecture hall as university education moves online. You can read the interview here: http://spectrum.ieee.org/geek-life/profiles/john-l-hennessy-risk-taker

NLP course stats:

“We’ll try to post more complete numbers in a future announcement, but here are some quick numbers: about 60,000 students pre-registered, 45,000 registered, 25,000 watched the first lecture, 6000 turned in PA1, 4000 turned in PA2, and about 6,000 seem to be watching the videos.”
Posted by Dan Jurafsky (Instructor)
on Wed 11 Apr 2012 4:15:12 PM PDT

It looks like about 1800 students will complete all the quizzes in the class.

Some quick stats from the first iteration of +Udacity and +Coursera

Intro to Artificial Intelligence by Sebastian Thrun and Peter Norvig [1]
Registered students: 160,000 [2]
Students who cleared the course: 23,000 [3]
Students with perfect scores: 248 [2]

Machine Learning by Andrew Ng [4]
Registered students: 100,000 [5]
Students who received a statement of accomplishment for the advanced track: 12,195 [4]
Students who received a statement of accomplishment for the basic track: 1,034 [4]

Introduction to Databases by Jennifer Widom [6]
Registered students: 91,734 [7]
Students who submitted at least some work for grading: 25,859 [7]
Students who received a statement of accomplishment: 6513 [7]
Students who took the final exam: 7,108 [7]

Detailed statistics for the DB class are available at [7].

#coursera #stanford #udacity

Sources:
[1] https://www.ai-class.com/
[2] http://blogs.reuters.com/felix-salmon/2012/01/23/udacity-and-the-future-of-online-universities/
[3] https://plus.google.com/b/107809899089663019971/107809899089663019971/posts/ipuBzuy5o9h
[4] http://www.ml-class.org/
[5] http://blogs.reuters.com/felix-salmon/2012/01/31/udacitys-model/
[6] http://www.db-class.org/
[7] https://plus.google.com/b/107809899089663019971/107809899089663019971/posts/5NG7sXNjapV

 

 
  [ # 34 ]

I weeded myself out after Programming Assignment 6, implementing the CKY algorithm. After spending so much time on previous assignments, I was tired of reinventing wheels, and in a format I couldn’t easily use as a tool for my purposes. Also, Manning’s long videos put me to sleep!
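For anyone curious what PA6 was asking for, the core of CKY itself is small once the grammar is in Chomsky normal form. A minimal recognizer sketch, using my own toy grammar and names (nothing like the assignment's treebank grammar):

```python
# A minimal CKY recognizer for a toy grammar in Chomsky normal form.
# grammar maps a pair of child nonterminals to the parents that produce it.
grammar = {
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
}
lexicon = {"the": {"Det"}, "dog": {"N"}, "cat": {"N"}, "saw": {"V"}}

def cky_recognize(words):
    n = len(words)
    # chart[i][j] = set of nonterminals that span words[i:j]
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        chart[i][i + 1] = set(lexicon.get(w, ()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):        # try every split point
                for b in chart[i][k]:
                    for c in chart[k][j]:
                        chart[i][j] |= grammar.get((b, c), set())
    return "S" in chart[0][n]

print(cky_recognize("the dog saw the cat".split()))  # True
```

A real parser also stores backpointers and rule probabilities in each cell, which is where most of the assignment's work went.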

The revolutionary aspect of the free, online learning paradigm is that I can go back when I’m ready and continue at my own pace. I didn’t take ml-class the first time, but completed over half the class on my own, getting feedback (but no credit) for the review questions and programming assignments. Now I’m taking the second iteration of the class with everyone, so the first six or seven weeks are review for me…

I got a lot of good knowledge out of the first five or six weeks of nlp-class. I found the programming assignments to be too rigidly fitted to their grading purposes, so I took a fair amount of time to retool the Naive Bayes sentiment classifier, for example, into something that I could interact with. I found myself wanting to talk to the programs as I would to a chatbot; I wanted to submit reviews live to the sentiment classifier and have it give me immediate feedback at runtime, instead of executing a command-line program that went through the training procedure each time…

http://subbot.org/nlp-class/pa3/readme.txt (code at subbot.org/nlp-class/pa3) is a write-up of what I did with the sentiment classifier. I plan to turn it into an agent and add it to my system.
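As a sketch of what I mean by interacting with it: train a multinomial Naive Bayes classifier once, then classify live input at runtime instead of re-running the whole training pipeline per query. Toy training data and class names are my own, not the assignment's:

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing."""
    def __init__(self):
        self.class_totals = Counter()            # word tokens per class
        self.word_counts = defaultdict(Counter)  # class -> word -> count
        self.doc_counts = Counter()              # documents per class
        self.vocab = set()

    def train(self, label, text):
        self.doc_counts[label] += 1
        for w in text.lower().split():
            self.word_counts[label][w] += 1
            self.class_totals[label] += 1
            self.vocab.add(w)

    def classify(self, text):
        words = text.lower().split()
        n_docs = sum(self.doc_counts.values())
        best, best_lp = None, float("-inf")
        for c in self.doc_counts:
            lp = math.log(self.doc_counts[c] / n_docs)  # log prior
            denom = self.class_totals[c] + len(self.vocab)
            for w in words:                              # log likelihoods
                lp += math.log((self.word_counts[c][w] + 1) / denom)
            if lp > best_lp:
                best, best_lp = c, lp
        return best

nb = NaiveBayes()
nb.train("pos", "a delightful funny charming film")
nb.train("neg", "a dull boring waste of time")
print(nb.classify("what a charming and funny movie"))  # pos

# The interactive loop I wanted: submit reviews live, get a label back.
# while True:
#     print(nb.classify(input("review> ")))
```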

Another example: when the parsers got something wrong, I wanted to tell them, at runtime, what they got wrong; give them rules that they could learn on the spot…I’m still trying to figure out how to do that with the Stanford Lexparser.

Coming from a linguistics background, I find my instincts run counter to much of the statistical NLP approach. Manning, in http://see.stanford.edu/materials/ainlpcs224n/transcripts/NaturalLanguageProcessing-Lecture03.html , says:

The other half of the story is language models work really well if you’ve got a ton of data. That may not be rocket science, it’s true, but you know, this is kind of in some sense the slightly sad part of modern NLP, right? That it turns out that for a lot of things, you can try and do really clever things on a small amount of data, or else you can do dumb things on a very large amount of data. And, well, normally doing dumb things on a large amount of data will work just as well.

Now of course you can have the best of both worlds and you can do clever things on large amounts of data and that will be better again, but you know, if you think about it abstractly, you can try and do a clever job at smoothing and you can do good things with that, but it’s still kind of having a poke in the dark.

If your other choice is to go and collect the 100 times more data and then just have a much better sense of which words occur after “large” with what probability, then that empirical data is just going to be better.
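Manning's "which words occur after large" example is just bigram counting: with enough data, the raw maximum-likelihood estimates need no cleverness at all. A toy sketch (the tiny corpus is my own; a real model would count over millions of words):

```python
from collections import Counter, defaultdict

# Count bigrams and read off P(next_word | "large") directly.
corpus = ("the large scale model ran on a large corpus and "
          "a large amount of data made the large model better").split()

bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

after_large = bigrams["large"]
total = sum(after_large.values())
for word, count in after_large.most_common():
    print(f"P({word} | large) = {count}/{total} = {count / total:.2f}")
```

With 100 times more text, those counts sharpen on their own, which is exactly his point about dumb methods on large data.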

The problem is, I still run into instances where the statistical NLP approach may produce a 95% (or whatever) score when it’s tested on a standard corpus from the Wall Street Journal; but when I try something on it, it fails. In those cases, I want to teach it right away, as I would a kid. I want to use online, or active, learning; but the programming assignments for nlp-class didn’t teach that.
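The kind of online correction I'm after can be as simple as a bag-of-words perceptron that you nudge whenever it gets an example wrong, at runtime, the way you'd correct a kid. A sketch with my own toy labels, not anything from the nlp-class code:

```python
from collections import defaultdict

# Bag-of-words perceptron: correctable at runtime, one example at a time.
weights = defaultdict(float)

def predict(text):
    score = sum(weights[w] for w in text.lower().split())
    return "pos" if score >= 0 else "neg"

def correct(text, true_label):
    """If the model is wrong, nudge its weights toward the true label."""
    if predict(text) != true_label:
        delta = 1.0 if true_label == "pos" else -1.0
        for w in text.lower().split():
            weights[w] += delta

# Teach it one mistake at a time:
correct("this film was dreadful", "neg")
print(predict("a dreadful film"))  # neg
```

The same idea scales up: error-driven updates are the heart of online learning, and nothing stops a parser or classifier from accepting them live instead of only in a batch training run.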

The other problem I had with nlp-class is the honor code. I’m used to asking questions on the internet, and if someone knows the answer, they help me out. But the honor code makes that desire to share and help others dishonorable. So on the class discussion forums you had a lot of beating around the bush, people who knew the answer bending over backwards not to say it explicitly. I think the honor code teaches that hoarding information is a good thing, and I think that’s bad for society. Science progresses fastest when information is free and open to all.

Instead of testing us on material that they already know the answer to, and keeping their fists closed around the knowledge needed to solve it, why not assign problems they don’t know how to do, and have us work collaboratively to try to improve the state of the art?

Anyway I’m taking the Logic class now, and the discussion forums there seem much less concerned with the honor code. The explicitly stated goal of the class is to get us to finish it.

So I’ve had time to work on my logicagent (subbot.org/logicagent), to make it do some of the exercises…for example:

http://subbot.org/intrologic/golfers/README.txt
http://subbot.org/intrologic/relational/jackandjill.html

 

 
