3. CHALLENGE DESCRIPTION ix

3. Challenge Description

3.1. Data Sets. The collection of benchmark inputs of the 10th DIMACS

Implementation Challenge includes both synthetic and real-world data. All graphs

are undirected. Formerly directed instances were symmetrized by making every

directed edge undirected. While this procedure necessarily loses information in a

number of real-world applications, it appeared to be necessary since most existing

software libraries can handle undirected graphs only. Directed graphs (or unsym-

metric matrices) are left for further work.

Synthetic graphs in the collection include random graphs (Erd˝ os-R´ enyi, R-

MAT, random geometric graphs using the unit disk model), Delaunay triangula-

tions, and graphs that mimic meshes from dynamic numerical simulations. Real-

world inputs consist of co-author and citation networks, road networks, numerical

simulation meshes, web graphs, social networks, computational task graphs, and

graphs from adapting voting districts (redistricting).

For the actual challenge two subsets were chosen, one for graph partitioning

and one for graph clustering. The first one (for graph partitioning) contained 18

graphs, which had to be partitioned into 5 different numbers of parts each, yielding

90 problem instances. The second one (for graph clustering) contained 30 graphs.

Due to the choice of objective functions for graph clustering, no restriction on the

number of parts or their size was necessary in this category.

3.2. Categories. One of the main goals of the challenge was to compare dif-

ferent techniques and algorithmic approaches. Therefore participants were invited

to join different challenge competitions aimed at assessing the performance and

solution quality of different implementations. Let G = (V, E, ω) be an undirected

graph with edge weight function ω.

3.2.1. Graph Partitioning. Here the task was to compute a partition Π of the

vertex set V into k parts of size at most (1 + )

|V |

k

. The two objective functions

used to assess the partitioning quality were edge cut (EC, total number of edges

with endpoints in different parts) and maximum communication volume (CV). CV

sums for each part p and each vertex v therein the number of parts adjacent to v

but different from p. The final result is the maximum over each part.

For each instance result (EC and CV results were counted as one instance each),

the solvers with the first six ranks received a descending number of points (10, 6,

4, 3, 2, 1), a scoring system borrowed from former Formula 1 rules.

Three groups submitted solutions to the graph partitioning competition. Only

one of the submitted solvers is a graph partitioner by nature, the other two are

actually hypergraph partitioners. Both hypergraph partitioners use multilevel re-

cursive bisection. While their quality, in particular for the communication volume,

is generally not bad, the vast majority of best ranked solutions (139 out of 170) are

held by the graph partitioner KaPa.

3.2.2. Graph Clustering. The clustering challenge was divided into two separate

competitions with different optimization criteria. For the first competition the

objective modularity had to be optimized. Modularity has been a very popular

measure in the last years, in particular in the field of community detection. It

follows the intra-cluster-density vs. inter-cluster-sparsity paradigm. However, some