Utilizing Apache Hadoop in Clique Detection Methods

Authors

  • László Kovács
  • Gábor Szabó

Keywords:

graph algorithms, clique detection, MapReduce architecture, Apache Hadoop, parallel systems

Abstract

Many areas of information technology and mathematics require processing large graphs, for example data mining on social networks and routing problems. Many of these areas require us to explore the connections among nodes and find all the maximal cliques in a graph, i.e. all node sets whose members are pairwise connected and which cannot be extended by a further node. One widely used clique detection method is the Bron-Kerbosch algorithm. However, this technique alone may be too slow for big graphs, so porting the method to a massively parallel system can reduce the overall runtime. This paper introduces some possibilities and starting points for utilizing the open source Apache Hadoop framework, which helps in using the resources of multiple computers. Its MapReduce architecture makes it possible to divide the big task into smaller chunks and eventually solve the problem faster than the equivalent sequential methods.
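As a rough illustration of the idea in the abstract, the following minimal sketch shows how Bron-Kerbosch clique enumeration might be split into independent per-node subproblems in a map/reduce style. It is plain Python and does not use the Hadoop API; the function names (bron_kerbosch, map_task, reduce_task, maximal_cliques), the adjacency-set graph representation, and the per-node decomposition by node ordering are assumptions made for this example and are not taken from the paper.

```python
from typing import Dict, FrozenSet, Iterable, List, Set

Graph = Dict[int, Set[int]]  # node -> set of neighbouring nodes (assumed representation)


def bron_kerbosch(graph: Graph, r: Set[int], p: Set[int], x: Set[int],
                  out: List[FrozenSet[int]]) -> None:
    """Basic Bron-Kerbosch recursion without pivoting.

    r: current clique, p: candidates that can still extend r,
    x: nodes already handled, used to reject non-maximal cliques.
    """
    if not p and not x:
        out.append(frozenset(r))  # r cannot be extended, so it is maximal
        return
    for v in list(p):
        bron_kerbosch(graph, r | {v}, p & graph[v], x & graph[v], out)
        p = p - {v}
        x = x | {v}


def map_task(graph: Graph, v: int) -> List[FrozenSet[int]]:
    """Hypothetical 'map' step: maximal cliques whose smallest node is v."""
    later = {u for u in graph[v] if u > v}    # candidates
    earlier = {u for u in graph[v] if u < v}  # covered by other map tasks
    out: List[FrozenSet[int]] = []
    bron_kerbosch(graph, {v}, later, earlier, out)
    return out


def reduce_task(partials: Iterable[List[FrozenSet[int]]]) -> Set[FrozenSet[int]]:
    """Hypothetical 'reduce' step: merge the per-node results."""
    result: Set[FrozenSet[int]] = set()
    for part in partials:
        result.update(part)
    return result


def maximal_cliques(graph: Graph) -> Set[FrozenSet[int]]:
    # Each map_task call is independent, so the calls could run in parallel.
    return reduce_task(map_task(graph, v) for v in sorted(graph))


# Small example graph: a triangle 1-2-3 with an extra edge 3-4.
example: Graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
cliques = maximal_cliques(example)
```

Because every map task only needs the neighbourhood of its own node, the subproblems share no state, which is the property a MapReduce job would rely on. How the actual work is partitioned and balanced across a Hadoop cluster is not specified here.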

Published

2015-12-30