Monday, August 16, 2010

MPBoost++

I have finally released the C++ implementation of MPBoost, a simple yet effective and efficient variant of the AdaBoost.MH algorithm. All the details are on the dedicated page.

Tuesday, June 8, 2010

SentiWordNet 3.0

Presented at LREC 2010, the new version of SentiWordNet is available for search and download on the SentiWordNet website.

SentiWordNet 3.0 is generated with an improved algorithm compared to the previous SentiWordNet release, and it is aligned to the latest Princeton WordNet release (3.0).

A notable new feature of the website is the ability to provide feedback on the values assigned to synsets.

SentiWordNet user feedback interface

This feature is a first step towards building a community of SentiWordNet users who collaboratively improve SentiWordNet.

Any feedback collected from the community will be released into the public domain through the SentiWordNet site, i.e., what comes from the community goes back to the community.

Sunday, August 9, 2009

Quick user input with Console.ReadKey()

One of the many useful applications of the Console.ReadKey() method of the .NET Framework is getting quick input from the user.
With this method, the user does not need to press the Return key to submit the input.

Friday, June 5, 2009

How to Build a 100-Million-Image Database

The arXiv blog has picked our article describing the CoPhIR collection for a short review.

Tuesday, April 21, 2009

Google similar image search

The future is closer than I thought.

Google has recently published a similar image search service in its Labs section.

No details on the recipe for building such a system have been released, but my first impression is that some of the ingredients are SIFT, textual context, and inverted lists (used here to index visual similarity properties).
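To make the "inverted lists" ingredient concrete, here is a minimal sketch of how such an index could work. It assumes that local descriptors have already been quantized into integer "visual word" ids; the image names, the word ids, and the vote-counting score are all hypothetical and are not taken from Google's system, whose details are unknown.

```python
from collections import Counter, defaultdict


def build_inverted_index(images):
    """Map each visual word to the set of image ids that contain it."""
    index = defaultdict(set)
    for image_id, words in images.items():
        for word in words:
            index[word].add(image_id)
    return index


def search(index, query_words, top_k=3):
    """Rank images by the number of distinct query words they share."""
    votes = Counter()
    for word in set(query_words):
        # Only the posting lists of the query's words are visited,
        # so images with nothing in common are never touched.
        for image_id in index.get(word, ()):
            votes[image_id] += 1
    return votes.most_common(top_k)


# Hypothetical collection: each image is a list of visual-word ids.
images = {
    "portrait.jpg": [3, 17, 17, 42, 88, 91],
    "portrait_cropped.jpg": [17, 42, 88],
    "arch.jpg": [5, 9, 23, 64],
}

index = build_inverted_index(images)
results = search(index, [17, 42, 88, 91])
print(results)
```

The point of the structure is that query cost depends on the length of the posting lists for the query's words rather than on the size of the whole collection, which is what makes it attractive at web scale.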

The Google Labs site lists Chuck/Charles Rosenberg as one of the engineers who worked on similar image search. His Google bibliography includes an interesting article (by Rosenberg, Ting Liu and Henry A. Rowley), Clustering Billions of Images with Large Scale Nearest Neighbor Search, which probably describes another ingredient of the recipe (large scale image clustering).

Two other interesting, and probably related, papers from Googlers (Shumeet Baluja and Yushi Jing) are PageRank for Product Image Search, and VisualRank: Applying PageRank to Large-Scale Image Search.

What’s the future for MiPai? To become an ingredient of a better recipe!

Why SIFT?

Look at the following search results for Bill Gates’ mugshot:

gates.png

The last results show a number of partial matches: cropped, scaled, and shifted parts of the original image. This is the kind of match that SIFT is very good at spotting, while other typical similarity measures (e.g. MPEG-7 visual descriptors) miss it because they treat the whole image as a single entity.
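The difference between local and whole-image matching can be shown with a toy sketch. This is not SIFT: exact 2x2 pixel patches stand in for local descriptors (so there is no scale or rotation invariance), and a plain intensity histogram stands in for a global descriptor. The tiny grid "images" and both scoring functions are made up for illustration.

```python
def patches(img, size=2):
    """All size x size patches of a grid image; their positions are discarded."""
    h, w = len(img), len(img[0])
    return {
        tuple(tuple(img[r + dr][c:c + size]) for dr in range(size))
        for r in range(h - size + 1)
        for c in range(w - size + 1)
    }


def local_score(a, b):
    """Fraction of the smaller image's patches that also occur in the other."""
    pa, pb = patches(a), patches(b)
    return len(pa & pb) / min(len(pa), len(pb))


def global_score(a, b):
    """Histogram intersection computed over each whole image."""
    def hist(img):
        flat = [v for row in img for v in row]
        return {v: flat.count(v) / len(flat) for v in set(flat)}

    ha, hb = hist(a), hist(b)
    return sum(min(ha.get(v, 0), hb.get(v, 0)) for v in set(ha) | set(hb))


# A small "object" on an empty background, and a crop containing only the object.
original = [
    [0, 0, 0, 0],
    [0, 1, 2, 0],
    [0, 3, 4, 0],
    [0, 0, 0, 0],
]
crop = [
    [1, 2],
    [3, 4],
]

print("local :", local_score(original, crop))
print("global:", global_score(original, crop))
```

The crop's only patch occurs unchanged inside the original, so the local score is perfect, while the global histogram is dominated by the background and reports a weak match.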

Why textual context?

Look at the following search results for the St. Louis Gateway Arch (the “Gateway to the West”), which was previously retrieved by searching for the word “arch”:

arch.png

The last result shows a completely dissimilar image, but its source page contains the word “arch”.

Quote

I think the bubble sort would be the wrong way to go.

(Barack Obama, "Job" interview with Google)
