Tag a tweet
 
 

Hi all,
I need your help! I’ve made a small website for labelling tweets as positive or negative in order to build up a dataset that can be used for training some algorithms: tagatweet.bragisoft.com/. All the results are publicly available from the website.
I need at least a couple of thousand tagged statements for the algorithms to work properly, so there’s still a bit of tagging to do.

 

 
  [ # 1 ]

What criteria should we use for determining “positive” or “negative”, Jan? I’ve run through a couple, and they were more or less in the realm of “gobbledy-gook”. For example:

RETWEET na yg sayang aku :p wkwk
Harap kawasan je besar tapi lampu jaa
@carolcorazza htt p://t.co/1bKK8TmV (text altered to prevent link creation)

Ok, now granted, some of the above examples are in languages I don’t recognize, let alone read, but still, should I just use the “don’t know” button liberally? :)

I would also suggest a “neutral” button. Things like just the number 100 are neither positive nor negative, and “don’t know” isn’t proper for those. Just a suggestion. :)

 

 
  [ # 2 ]
Dave Morton - Oct 27, 2012:

What criteria should we use for determining “positive” or “negative”, Jan? I’ve run through a couple, and they were more or less in the realm of “gobbledy-gook”. For example:

RETWEET na yg sayang aku :p wkwk
Harap kawasan je besar tapi lampu jaa
@carolcorazza htt p://t.co/1bKK8TmV (text altered to prevent link creation)

Ok, now granted, some of the above examples are in languages I don’t recognize, let alone read, but still, should I just use the “don’t know” button liberally? :)

I would also suggest a “neutral” button. Things like just the number 100 are neither positive nor negative, and “don’t know” isn’t proper for those. Just a suggestion. :)

Yes, the foreign languages are a problem. I’m trying to filter on English-language tweets only, but apparently many tweets in the data are falsely flagged as ‘english’.
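For illustration only, here is a rough sketch of one way the false ‘english’ flags could be caught with a second check. It is not what the site actually does: the stopword list, the `looks_english` name and the 0.2 threshold are all assumptions made up for this example. The idea is simply that real English text almost always contains some very common function words.

```python
import re

# Hypothetical list of very common English words (an assumption for this
# sketch; a real filter would want a much larger list).
ENGLISH_STOPWORDS = {
    "the", "a", "an", "and", "or", "but", "is", "are", "was", "i", "you",
    "he", "she", "it", "we", "they", "my", "your", "to", "of", "in", "on",
    "for", "with", "not", "this", "that", "so", "just", "have", "be", "at",
}


def looks_english(tweet, threshold=0.2):
    """Return True if enough of the tweet's words are common English words."""
    # Drop URLs and @mentions, which carry no language signal.
    cleaned = re.sub(r"https?://\S+|@\w+", " ", tweet.lower())
    words = re.findall(r"[a-z']+", cleaned)
    if not words:
        # Nothing but links, mentions or digits: treat as not usable.
        return False
    hits = sum(1 for w in words if w in ENGLISH_STOPWORDS)
    return hits / len(words) >= threshold
```

A check like this would reject tweets such as the first two examples quoted above, as well as tweets that consist only of a mention and a link. Very short English tweets with no function words would be rejected too, so the threshold is a trade-off.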

The idea is to use the ‘I don’t know’ button as liberally as possible. When you’re not certain, just press that one. At the moment, this is not a recorded value.
I know what you mean about the ‘neutral’ thing. I have been thinking about that one as well. For the training algorithm that I currently have in mind, this isn’t really important, but it might be in the future.
I don’t know yet. For the time being, just say ‘don’t know’ and you’ll get the next one (there are plenty of tweets to tag).

In the future, I might perhaps go through some statements recursively to tag them with different things besides positive/negative. 

What criteria should we use for determining “positive” or “negative”, Jan?

Anything that clearly leans one way or the other. For instance: insults, or people telling how miserable they feel = negative. People going like YESSSS, or :) or WE WON = positive.
Take a peek at the ‘results’ tab. I’ve already labeled some.

 

 
  [ # 3 ]

Will do, Jan. I’ve already “tagged” around a hundred or so, and I’ll tag a few more before I get back to work on Program O. :)

 

 
  [ # 4 ]

Thanks a lot Dave. Much appreciated.

 

 
  [ # 5 ]

Jan, there has been a lot of work done already on Twitter sentiment analysis. There should be already-tagged sets of tweets available. Most of the Twitter sentiment analysis I’ve seen works on simple positive, neutral, negative. More complex sentiment analysis would be nice. Robert Medeksza has done a lot of work with Twitter for his Ultra Hal, though I don’t know whether that included sentiment analysis per se.

> http://www.meta-guide.com/home/ai-engine/chatbots-sentiment-analysis

I’ve used Twitter sentiment analysis to look at the relative popularity of SimSimi vs. Cleverbot.

> http://www.meta-guide.com/home/bibliography/google-scholar/open-source-sentiment-analysis-tools

And, I’ve looked at open source sentiment analysis tools in the context of dialog systems.

 

 
  [ # 6 ]

@Marcus: From where did you get the data for your diagram? Do you have a link to an already existing source of tweets that have been tagged like this and which can be used for your own algorithms?
I know plenty of people have already done something similar, but I haven’t found any data sets to use, so I figured I’d build my own and let people use the data as well.

Here’s the thing: I already have a system that’s able to detect insults, but I built it manually. I’d like to do something similar on a broader scale, not just for insults. For this, I need more data. Also, I figured that for the next algorithm I’d use an automated (i.e. statistical) approach.
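To make the idea of a statistical approach concrete, here is a minimal sketch of a Naive Bayes classifier trained on positive/negative labelled statements. The thread does not say which algorithm is actually planned, so this is only one common choice picked for illustration. The function names and the four sample statements are invented for this example and are not taken from the tagged data.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(labelled):
    """Build per-label word counts from (text, label) pairs."""
    word_counts = {}        # label -> Counter of word frequencies
    doc_counts = Counter()  # label -> number of statements seen
    for text, label in labelled:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    return {"words": word_counts, "docs": doc_counts, "vocab": vocab}


def classify(model, text):
    """Return the most probable label, or None if the model is empty."""
    total_docs = sum(model["docs"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_docs in model["docs"].items():
        counts = model["words"][label]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)  # log prior
        for word in tokenize(text):
            if word not in model["vocab"]:
                continue  # words never seen in training are ignored
            # Add-one (Laplace) smoothing avoids log(0).
            score += math.log((counts[word] + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented sample data, standing in for the real tagged tweets.
SAMPLE = [
    ("we won yes", "positive"),
    ("what a great day", "positive"),
    ("i feel miserable", "negative"),
    ("you are an idiot", "negative"),
]
model = train(SAMPLE)
print(classify(model, "we won"))
```

With only four statements this is a toy, which is why a couple of thousand tagged statements are needed before the word counts become meaningful.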

 

 
  [ # 7 ]

I’ve done a handful for you, Jan, and have the website up so I can do a few more while some code I am working on compiles on my end.

 

 
  [ # 8 ]

Thanks Steve.
I have found it quite fun reading tweets like this. It’s like peeping through a keyhole.

 

 
  [ # 9 ]

I ran thru about 50 so far, and will do more here and there.

 

 
  [ # 10 ]

Thanks Jeff. Much appreciated.

 

 
  [ # 11 ]

> http://www.mturk.com

Jan, I seem to recall that a lot of people use Amazon Mechanical Turk (link above) for that type of human training task.

 

 