I led the IBM team's participation in the EU 6th Framework Programme project Nepomuk, which aimed to build a social semantic desktop.
IBM LanguageWare Miner for Multidimensional Socio-Semantic Networks / Galaxy
There are several reasons why I chose the codename Galaxy: the methods for numerical simulation in cosmology that I developed before joining IBM, my ideas about building semantic models of texts as “clouds of concepts”, and Galaxy’s applications to social network analysis (where "important people" act like the stars of their communities). I’ll try to explain this in some detail.
My work as a visiting professor at the Observatoire de la Côte d'Azur was related to the numerical simulation of the formation of galaxies and stars. Basically, stars appear as the result of the collision and adhesion of small dust particles created by the Big Bang. In fluid dynamics, this process can be simulated rather straightforwardly by the following algorithm:
a) Compute the trajectories of all particles
b) Detect the moment of the first collision
c) Compute the trajectory of the new particle resulting from this collision
d) Go to step b)
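The steps above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not taken from the original work: motion is one-dimensional, particles move freely between collisions, and colliding particles stick together with conservation of mass and momentum. The names `Particle`, `first_collision` and `simulate` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Particle:
    x: float  # position on the line
    v: float  # velocity
    m: float  # mass


def first_collision(particles):
    """Step b): scan all neighbouring pairs for the earliest collision.

    Returns (time_until_collision, index_of_left_particle) or None.
    In one dimension the first collision is always between neighbours.
    """
    best = None
    for i in range(len(particles) - 1):
        a, b = particles[i], particles[i + 1]
        closing = a.v - b.v
        if closing > 0:  # the left particle catches up with the right one
            dt = (b.x - a.x) / closing
            if best is None or dt < best[0]:
                best = (dt, i)
    return best


def simulate(particles, t_end):
    """Naive event-driven loop: every step touches every particle."""
    # Work on copies sorted by position so the caller's data is untouched.
    ps = sorted((Particle(p.x, p.v, p.m) for p in particles), key=lambda p: p.x)
    t = 0.0
    while True:
        hit = first_collision(ps)
        if hit is None or t + hit[0] > t_end:
            # No further collision before t_end: advance and stop.
            for p in ps:
                p.x += p.v * (t_end - t)
            return ps
        dt, i = hit
        # Step a): move all particles along straight trajectories.
        for p in ps:
            p.x += p.v * dt
        t += dt
        # Step c): merge the colliding pair (adhesion, momentum conserved).
        a, b = ps[i], ps[i + 1]
        mass = a.m + b.m
        ps[i:i + 2] = [Particle(a.x, (a.m * a.v + b.m * b.v) / mass, mass)]
        # Step d): loop back to the collision search.
```

Each pass through the loop rescans and moves every particle, so the cost of a single collision grows with the total number of particles.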
The idea of a better algorithm came to me after reflecting on two principles of locality. Firstly, the principle of locality in physics tells us that an object is influenced directly only by its immediate surroundings (in the absence of long-distance forces). Secondly, the algorithm described above violates the principle of locality in computer science: as a result, the repeated computations in step b) incur huge memory traffic to fetch the trajectories of all particles.
These observations suggested to me that it might be possible to model global phenomena by local computations. The high performance of Galaxy (mining of huge networks is usually done in about 100 milliseconds on an ordinary PC) is due to the fact that all graph-based algorithms used in Galaxy follow the principles of locality.
While working at the Observatoire de la Côte d'Azur, I invented a better algorithm than the straightforward one outlined above. The physicist Sergei Gurbatov provided the physical insight into my new "Fast Legendre transform" algorithm, and we published our results in the paper "The decay of multiscale signals – deterministic model of the Burgers turbulence".
Google's PageRank, as well as many algorithms in social network analysis that compute centrality measures (who are the most important people in a network), is essentially based on a dynamic, model-based view of centrality that focuses on the outcomes for nodes in a network where something flows from node to node across the links. Network flow processes are apparently similar to the processes studied in fluid dynamics, and inspiration from physics proved useful when I started to work on the analysis of massive dynamic socio-semantic networks.
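To make the flow-based view of centrality concrete, here is a minimal sketch of the textbook PageRank power iteration. It is not the algorithm used in Galaxy; the function name `pagerank`, the damping factor 0.85 and the handling of nodes without outgoing links are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Flow-based centrality: rank repeatedly flows along outgoing links.

    links maps each node to the list of nodes it points to.
    """
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives a small uniform share regardless of links.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Each node only pushes rank to its direct neighbours.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a node with no outgoing links spreads its rank evenly.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical example network: three people all point to a "hub" person.
example = {"a": ["hub"], "b": ["hub"], "c": ["hub"], "hub": ["a"]}
scores = pagerank(example)
most_central = max(scores, key=scores.get)
```

Each update of a node uses only its direct neighbours, which is the kind of local computation the text describes.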
From my previous Projects
On the territory of the Chernobyl Nuclear Power Station (the sarcophagus over the destroyed reactor is visible). Left to right: Dr. A. Troussov (Russia), Prof. J. Bonnin (France), Dr. O. Sidletsky (Russia), Dr. A. Novikov (Russia), Prof. A. Gvishiani (Russia)
