HIGH QUALITY GRAPH PARTITIONING 3

(1 + )c(V )/k + maxv∈V c(v) for some parameter . The last term in this equation

arises because each node is atomic and therefore a deviation of the heaviest node

has to be allowed. The objective is to minimize the total cut

∑

ij

w(Eij) where

Eij := {{u, v} ∈ E : u ∈ Vi,v ∈ Vj}. A clustering is also a partition of the nodes,

however k is usually not given in advance and the balance constraint is removed. A

vertex v ∈ Vi that has a neighbor w ∈ Vj,i = j, is a boundary vertex. An abstract

view of the partitioned graph is the so called quotient graph, where vertices represent

blocks and edges are induced by connectivity between blocks. Given two clusterings

C1 and C2 the overlay clustering is the clustering where each block corresponds to a

connected component of the graph GE = (V, E\E) where E is the union of the cut

edges of C1 and C2, i.e. all edges that run between blocks in C1 or C2. We will need

the of overlay clustering to define a combine operation on partitions in Section 5.

By default, our initial inputs will have unit edge and node weights. However, even

those will be translated into weighted problems in the course of the algorithm.

A matching M ⊆ E is a set of edges that do not share any common nodes,

i.e., the graph (V, M) has maximum degree one. Contracting an edge {u, v} means

to replace the nodes u and v by a new node x connected to the former neighbors

of u and v. We set c(x) = c(u) + c(v) so the weight of a node at each level is

the number of nodes it is representing in the original graph. If replacing edges of

the form {u, w},{v, w} would generate two parallel edges {x, w}, we insert a single

edge with ω({x, w}) = ω({u, w}) + ω({v, w}). Uncontracting an edge e undoes its

contraction. In order to avoid tedious notation, G will denote the current state of

the graph before and after a (un)contraction unless we explicitly want to refer to

different states of the graph. The multilevel approach to graph partitioning consists

of three main phases. In the contraction (coarsening) phase, we iteratively iden-

tify matchings M ⊆ E and contract the edges in M. Contraction should quickly

reduce the size of the input and each computed level should reflect the structure

of the input network. Contraction is stopped when the graph is small enough to

be directly partitioned using some expensive other algorithm. In the refinement

(or uncoarsening) phase, the matchings are iteratively uncontracted. After uncon-

tracting a matching, a refinement algorithm moves nodes between blocks in order

to improve the cut size or balance.

3. Related Work

There has been a huge amount of research on graph partitioning so that we

refer the reader to [26] for more material on multilevel graph partitioning and to

[15] for more material on genetic approaches for graph partitioning. All general

purpose methods that are able to obtain good partitions for large real world graphs

are based on the multilevel principle outlined in Section 2. Well known software

packages based on this approach include, Jostle [26], Metis [14], and Scotch [20].

KaSPar [19] is a graph partitioner based on the central idea to (un)contract only

a single edge between two levels. KaPPa [13] is a ”classical” matching based MGP

algorithm designed for scalable parallel execution. MQI [16] and Improve [2] are

flow-based methods for improving graph cuts when cut quality is measured by

quotient-style metrics such as expansion or conductance. This approach is only

feasible for k = 2. Improve uses several minimum cut computations to improve

the quotient cut score of a proposed partition. Soper et al. [23] provided the first

algorithm that combined an evolutionary search algorithm with a multilevel graph