However, with database tricks such as binary-tree index, we can make the above algorithm very efficient. Thus, it is helpful for algorithm guys like mathematicians to understand some database technologies. With those database tricks, we can implement algorithms that may be hard to do otherwise.
I am compiling a list of leading data providers. This is the current list. If your favorite data providers are not included here and you feel they should be, please contact me at email@example.com. Thanks. Dr. Zhou.
Friday, September 14, 2012
Relational database and data mining algorithms
Some algorithms may appear prohibitively expensive to perform computationally. One of such examples is finding for each data point in table A the nearest K neighboring data points in table B (shown in the figure below). From the surface, it involves calculating the distances between 2 trillion pairs of data points (1 million times 2 million).