Rényi Entropy and Free Energy


I want to keep telling you about information geometry… but I got sidetracked into thinking about something slightly different, thanks to some fascinating discussions here at the CQT.

There are a lot of people interested in entropy here, so some of us (Oscar Dahlsten, Mile Gu, Elisabeth Rieper, Wonmin Son and me) decided to start meeting more or less regularly. I call it the Entropy Club. I’m learning a lot of wonderful things, and I hope to tell you about them someday. But for now, here’s a little idea I came up with, triggered by our conversations:

• John Baez, Rényi entropy and free energy.

In 1960, Alfred Rényi defined a generalization of the usual Shannon entropy that depends on a parameter. If $latex p$ is a probability distribution on a finite set, its Rényi entropy of order $latex \beta$ is defined to be
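As a rough numerical illustration, the standard textbook form of the Rényi entropy is $latex H_\beta(p) = \frac{1}{1-\beta} \ln \sum_i p_i^\beta$ for $latex \beta \ne 1$, with the Shannon entropy recovered in the limit $latex \beta \to 1$. That form is assumed here, since the formula itself is not shown above. The function name `renyi_entropy` and the sample distribution are hypothetical, and this is only a minimal sketch in Python:

```python
import math

def renyi_entropy(p, beta):
    """Rényi entropy (in nats) of order beta for a finite distribution p.

    Assumes the standard definition H_beta(p) = ln(sum_i p_i**beta) / (1 - beta),
    and falls back to the Shannon entropy when beta is (numerically) 1.
    """
    if any(x < 0 for x in p) or not math.isclose(sum(p), 1.0, abs_tol=1e-9):
        raise ValueError("p must be a probability distribution")
    # Outcomes with zero probability contribute nothing, so drop them.
    support = [x for x in p if x > 0]
    if math.isclose(beta, 1.0):
        return -sum(x * math.log(x) for x in support)
    return math.log(sum(x ** beta for x in support)) / (1.0 - beta)

# Illustrative distribution (made up for this example).
p = [0.5, 0.25, 0.25]
for beta in (0.5, 1.0, 2.0):
    print(beta, renyi_entropy(p, beta))
```

For a uniform distribution every order gives the same value, the logarithm of the number of outcomes. For a non-uniform one, such as the sample above, the entropy shrinks as $latex \beta$ grows.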


Kernels Part 1: What is an RBF Kernel? Really?

My first blog post on machine learning discusses a pet peeve I have about working in the industry: why not to apply an RBF kernel to text classification tasks.

I wrote this as a follow-up to a Quora answer on the subject:


I will eventually rewrite this entry once I get better at LaTeX. For now, refer to

Smola, Schölkopf, and Müller, The connection between regularization operators and support vector kernels, http://cbio.ensmp.fr/~jvert/svn/bibli/local/Smola1998connection.pdf

I expand on one point: why not to use Radial Basis Function (RBF) kernels for text classification. I encountered this a few years ago while consulting at eBay, where not one but three of the teams (local, German, and Indian) were all doing this, with no success. They were treating a multi-class text classification problem using an SVM with an RBF kernel. What is worse, they were claiming the RBF calculations…
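For concreteness, the RBF kernel is commonly written $latex k(x, y) = \exp(-\gamma \|x - y\|^2)$. Below is a minimal sketch in Python of that form next to a plain linear kernel, evaluated on two toy term-count vectors. The function names, the `gamma` value, and the vectors are all hypothetical and chosen only for illustration; nothing here reproduces the setup used by the teams mentioned above.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """RBF (Gaussian) kernel: exp(-gamma * squared Euclidean distance)."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def linear_kernel(x, y):
    """Linear kernel: the ordinary dot product."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(x, y))

# Toy term-count vectors for two short documents (made up for this example).
doc_a = [1.0, 0.0, 2.0, 0.0]
doc_b = [0.0, 1.0, 2.0, 0.0]

print("RBF:   ", rbf_kernel(doc_a, doc_b, gamma=0.5))
print("linear:", linear_kernel(doc_a, doc_b))
```

The RBF value depends only on the distance between the two vectors and is always between 0 and 1, reaching 1 when they are identical. The linear kernel grows with the number of shared terms.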
