Collection of data
data collection
dictionary
wordnet
As papers accumulate in my Mendeley library, I thought it might be a good idea to start collecting some data as well. That should save me from having to scrape the web a month or so from now. To do this I built a small database and a bookmarklet that automatically saves any text I select into it. I've been saving 1-2 columns a day for a week now, so my collection is starting to take shape.
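For the curious, the storage side of such a setup could look roughly like the sketch below. It is not my actual code: the SQLite backend, the `snippets` table, and the function names `open_store`, `save_snippet` and `count_snippets` are all assumptions made for illustration, and the bookmarklet itself (which would send the selected text and the page URL to something like this) is left out.

```python
import sqlite3
from datetime import datetime, timezone


def open_store(path=":memory:"):
    """Open (or create) the snippet database. The table layout is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS snippets ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "source_url TEXT NOT NULL, "
        "body TEXT NOT NULL, "
        "saved_at TEXT NOT NULL)"
    )
    return conn


def save_snippet(conn, source_url, body):
    """Store one selected piece of text and return its row id."""
    body = body.strip()
    if not body:
        # An empty selection is not worth a row.
        raise ValueError("nothing selected")
    cur = conn.execute(
        "INSERT INTO snippets (source_url, body, saved_at) VALUES (?, ?, ?)",
        (source_url, body, datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()
    return cur.lastrowid


def count_snippets(conn):
    """Return how many snippets have been collected so far."""
    return conn.execute("SELECT COUNT(*) FROM snippets").fetchone()[0]
```

Keeping the source URL and a timestamp next to each text makes it possible to trace a snippet back to its origin later, which is handy once the collection grows.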


Speaking of databases, 2011 came with two nice surprises. Just before the end of 2010 I requested usage licenses for SentiWordNet and the WordNet Affect domain. I got both! SentiWordNet is an extension of WordNet that annotates its synsets with sentiment polarity. WordNet Affect does something similar by labeling synsets with affective states whose valence depends on the semantic context. Both lexicons might be very useful during the next phase, when they can serve as a corpus or dictionary for the algorithms I'd like to test. I'll tell more about this later this week.
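To give an idea of what "using a lexicon as a dictionary" could mean in practice, here is a minimal sketch. It assumes a lexicon that maps a word to a positive and a negative score, loosely in the spirit of SentiWordNet. The entries in `TOY_LEXICON` are invented for illustration and are not taken from SentiWordNet or WordNet Affect, and the `polarity` function is a hypothetical helper, not one of the algorithms I plan to test.

```python
from typing import Dict, Iterable, Tuple

# word -> (positive score, negative score); values are made up for this sketch
Lexicon = Dict[str, Tuple[float, float]]

TOY_LEXICON: Lexicon = {
    "good": (0.75, 0.0),
    "bad": (0.0, 0.625),
    "table": (0.0, 0.0),
}


def polarity(tokens: Iterable[str], lexicon: Lexicon) -> float:
    """Sum positive minus negative scores over all tokens.

    Words missing from the lexicon count as neutral.
    """
    pos = 0.0
    neg = 0.0
    for token in tokens:
        p, n = lexicon.get(token.lower(), (0.0, 0.0))
        pos += p
        neg += n
    return pos - neg
```

A real lexicon scores word senses (synsets) rather than bare words, so an actual implementation would first need to decide which sense of a word is meant. This sketch skips that step entirely.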
13-01-2011
2 Comments & Trackbacks
[...] during the week in the forthcoming posts. Collection is an essential pre-analysis step (see: Collection of data) and therefore [...]
[...] I mentioned earlier, I’ve received a license to use SentiWordNet (SWN) for my project. SWN is a lexical resource [...]