After a fabulous training week in Jülich (which was already covered by many other SoHPC participants e.g. here) I arrived to the European Green Capital – Ljubljana. The city is really beautiful and you really should visit this place. However, my blog post will be about one serious issue we have here.
Every day in the morning we cycle to the university from our dorm thanks to the city bike system and the extensive net of cycle paths. After a few hours of hard work, there is a time for the most important point in our day schedule: a lunch. Lunch in Ljubljana take us usually a whole hour which is a very nice change after working in Poland. In Poznan almost always I have my lunch in the rush, between classes or in the middle of work, by picking something from the university
canteen. Here, we have a calm walk to the one of local restaurants where we order something from the lunch offers menus which are pretty cheap and tasty (a perfect and rare combination). After having a lunch and a nice conversation in the meantime, we head to one of the coffee houses to take a cup of coffee with milk. Of course, we have a pretty good coffee machine at the university where we can take as much coffee as we want, however, citing one of the famous personage of SoHPC “in the coffee house they prepare a coffee for you and you can feel as a king at least for a minute“. Once we had a coffee on the top floor of one of the Ljubljana’s skyscrapers, another time we were in the vegan place (great place for a masochists coffee drinkers) and sometimes we pick one of the coffees on the picturesque alleys.
I suppose that you didn’t discover our main problem yet. Well, in Ljubljana there is so many different restaurants (and coffees!) that we usually can’t decide where to go! It’s really terrifying, because the lunch should be the time of relax and instead every single day we are facing a very hard, multi-criteria decision making & optimization problem. I’ve already discovered some heuristics which we follow. If Sam is really hungry we should go to Maxi where he can get a big schnitzel with a plenty of potatoes. We haven’t repeated our mistake and we have never been in the vegan coffee ever again. The only chance that we will eat in “Daisy” is when Leon is coming with us. But besides this simple rules we usually have no clue where to go… And at this point HPC takes the stage! Why? Because we are able to formulate our struggle as a linkage prediction problem.
We can construct a graph with nodes representing the Ljubljana’s restaurants and coffee places, with one additional node for our university/starting position. Later, to each node we can add some additional information to the nodes: the distance from the university, if we were there before (and how many times), if we liked the food there and when we were there for the last time (we usually don’t repeat restaurants in the consecutive days). We can also provide some additional information e.g. “is Sam very hungry?”, “is Leon coming with us?” etc.
Now, the problem is reduced to the prediction of the edges in the graph. For example, if we decide to go to Maxi, and later to the coffee on the street, the algorithm should predict three edges in our graph. One edge from the university to Maxi, one from Maxi to the coffee and one from the coffee to the university. How the computer can predict where we should go? Well, we can provide to the algorithm the information of the restaurants in Ljubljana (with all additional information mentioned before) and the history of our past lunches. Basing on that the program can e.g. extract frequent patterns of our decisions and infer the knowledge about about them. This inferred knowledge is later used to predict edges in our graph in the future.
Unfortunately, the computer usually needs to consider all possible edges to determine which of them are the most probable ones. In our simple graph this is not a problem at all, but if we consider much bigger graph, which occurs in more important (???), scientific applications then we are in a serious trouble. Why? For instance, if our graph contains 200 000 nodes, than there is almost 20 000 000 000 possible edges! My SoHPC project aims to solve this problem using huge computational resources offered by HPC. Do you want to know how I will do it? Wait for my next post 😉