3. CHALLENGE DESCRIPTION ix
3. Challenge Description
3.1. Data Sets. The collection of benchmark inputs of the 10th DIMACS
Implementation Challenge includes both synthetic and real-world data. All graphs
are undirected. Formerly directed instances were symmetrized by making every
directed edge undirected. While this procedure necessarily loses information in a
number of real-world applications, it appeared to be necessary since most existing
software libraries can handle undirected graphs only. Directed graphs (or unsym-
metric matrices) are left for further work.
Synthetic graphs in the collection include random graphs (Erd˝ os-R´ enyi, R-
MAT, random geometric graphs using the unit disk model), Delaunay triangula-
tions, and graphs that mimic meshes from dynamic numerical simulations. Real-
world inputs consist of co-author and citation networks, road networks, numerical
simulation meshes, web graphs, social networks, computational task graphs, and
graphs from adapting voting districts (redistricting).
For the actual challenge two subsets were chosen, one for graph partitioning
and one for graph clustering. The first one (for graph partitioning) contained 18
graphs, which had to be partitioned into 5 different numbers of parts each, yielding
90 problem instances. The second one (for graph clustering) contained 30 graphs.
Due to the choice of objective functions for graph clustering, no restriction on the
number of parts or their size was necessary in this category.
3.2. Categories. One of the main goals of the challenge was to compare dif-
ferent techniques and algorithmic approaches. Therefore participants were invited
to join different challenge competitions aimed at assessing the performance and
solution quality of different implementations. Let G = (V, E, ω) be an undirected
graph with edge weight function ω.
3.2.1. Graph Partitioning. Here the task was to compute a partition Π of the
vertex set V into k parts of size at most (1 + )
. The two objective functions
used to assess the partitioning quality were edge cut (EC, total number of edges
with endpoints in different parts) and maximum communication volume (CV). CV
sums for each part p and each vertex v therein the number of parts adjacent to v
but different from p. The final result is the maximum over each part.
For each instance result (EC and CV results were counted as one instance each),
the solvers with the first six ranks received a descending number of points (10, 6,
4, 3, 2, 1), a scoring system borrowed from former Formula 1 rules.
Three groups submitted solutions to the graph partitioning competition. Only
one of the submitted solvers is a graph partitioner by nature, the other two are
actually hypergraph partitioners. Both hypergraph partitioners use multilevel re-
cursive bisection. While their quality, in particular for the communication volume,
is generally not bad, the vast majority of best ranked solutions (139 out of 170) are
held by the graph partitioner KaPa.
3.2.2. Graph Clustering. The clustering challenge was divided into two separate
competitions with different optimization criteria. For the first competition the
objective modularity had to be optimized. Modularity has been a very popular
measure in the last years, in particular in the field of community detection. It
follows the intra-cluster-density vs. inter-cluster-sparsity paradigm. However, some