13-01-2011

Collection of data

As papers accumulate in my Mendeley library, I thought it might be a good idea to start collecting some data as well. It might save me from having to scrape the web in about a month. To do this I made myself a little database and a bookmarklet that automatically saves selected text into that database. I’ve been saving 1-2 columns a day for a week now, so my collection is starting to take shape.
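
For anyone curious what the database side of such a setup could look like, here is a minimal sketch in Python with SQLite. It is not my actual code: the table name `snippets`, its columns, and the helpers `open_store` and `save_snippet` are made up for illustration, and the bookmarklet is assumed to hand over the page URL and the selected text.

```python
import sqlite3
from datetime import datetime, timezone


def open_store(path=":memory:"):
    """Open (or create) the snippet database. Schema is a hypothetical example."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS snippets (
               id INTEGER PRIMARY KEY,
               source_url TEXT NOT NULL,
               body TEXT NOT NULL,
               saved_at TEXT NOT NULL
           )"""
    )
    return conn


def save_snippet(conn, source_url, body):
    """Store one piece of selected text and return its row id."""
    body = body.strip()
    if not body:
        raise ValueError("nothing selected")
    cur = conn.execute(
        "INSERT INTO snippets (source_url, body, saved_at) VALUES (?, ?, ?)",
        (source_url, body, datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()
    return cur.lastrowid


# Usage: an in-memory database and one saved selection.
conn = open_store()
first_id = save_snippet(conn, "http://example.com/column", "  Some selected text.  ")
count = conn.execute("SELECT COUNT(*) FROM snippets").fetchone()[0]
stored = conn.execute("SELECT body FROM snippets WHERE id = ?", (first_id,)).fetchone()[0]
```

Keeping the source URL and a timestamp next to the text costs almost nothing and makes it possible to trace each snippet back to where it came from later on.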

Speaking of databases, 2011 came with two nice surprises. Just before the end of 2010 I requested usage licenses for SentiWordNet and WordNet-Affect. I got both! SentiWordNet is an extension of WordNet that assigns sentiment polarity scores to its synsets. WordNet-Affect does something similar by labelling the synsets that express affective states, whose valence depends on the semantic context. Both lexicons might be very useful during the next phase, when they can serve as a corpus or dictionary for the algorithms I would like to test. I’ll tell more about this later this week.
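
To give an idea of how a polarity lexicon like this can be used as a dictionary, here is a small sketch that reads one entry. It assumes the tab-separated layout of the SentiWordNet 3.0 distribution (POS, ID, PosScore, NegScore, SynsetTerms, Gloss); the helper `parse_swn_line` is hypothetical and the sample line is invented for illustration, not copied from the lexicon.

```python
def parse_swn_line(line):
    """Parse one lexicon line into a dict; return None for comments and blanks."""
    if line.startswith("#") or not line.strip():
        return None
    pos, offset, pos_score, neg_score, terms, gloss = line.rstrip("\n").split("\t", 5)
    positive, negative = float(pos_score), float(neg_score)
    return {
        "pos": pos,
        "offset": offset,
        "positive": positive,
        "negative": negative,
        # The objective score is whatever is left after positive and negative.
        "objective": 1.0 - (positive + negative),
        # Terms look like "word#sense_number"; keep only the word.
        "terms": [t.rsplit("#", 1)[0] for t in terms.split()],
        "gloss": gloss,
    }


# Invented sample line, for illustration only.
sample = "a\t00000001\t0.75\t0.0\thappy#1 glad#2\tan invented gloss"
entry = parse_swn_line(sample)
```

With entries in this shape, building a word-to-score dictionary is a matter of looping over the file and collecting the scores per term.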
