Object segmentation needs to be driven by top-down knowledge to produce semantically meaningful results. In this paper, we propose a supervised segmentation approach that tightly integrates object-level top-down information with low-level image cues. The information from the two levels is fused under a kernelized structural SVM learning framework. We define a novel nonlinear kernel for comparing two image-segmentation masks. This kernel combines four different kernels: the object similarity kernel, the object shape kernel, the per-image color distribution kernel, and the global color distribution kernel. Our experiments show that the structural SVM algorithm finds bad segmentations of the training examples under the current scoring function and penalizes them so that they score lower than the example (good) segmentations. The result is a segmentation algorithm that not only knows what good segmentations are, but also learns potential segmentation mistakes and tries to avoid them. Our proposed approach obtains performance comparable to other state-of-the-art top-down segmentation approaches, yet is flexible enough to be applied to widely different domains.
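
As a rough illustration of how four component kernels might be fused into one, the sketch below forms a weighted sum of per-cue kernels. It is not the formulation used in this paper: the abstract does not state the combination rule, so the weighted sum, the Gaussian RBF form of each component, the equal weights, and the names `rbf_kernel`, `combined_kernel`, and `WEIGHTS` are all assumptions made for illustration. A non-negative weighted sum of valid kernels is itself a valid kernel, which is why it is a common default choice.

```python
import numpy as np

# Hypothetical equal weights for the four cues named in the abstract.
WEIGHTS = {
    "object_similarity": 0.25,
    "object_shape": 0.25,
    "per_image_color": 0.25,
    "global_color": 0.25,
}


def rbf_kernel(a, b, gamma=1.0):
    """Gaussian RBF similarity between two feature vectors (assumed form)."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.exp(-gamma * np.dot(diff, diff)))


def combined_kernel(feats_a, feats_b, weights=WEIGHTS):
    """Weighted sum of per-cue kernels.

    feats_a and feats_b map each cue name to the feature vector
    extracted for one (image, segmentation mask) pair.
    """
    return sum(
        w * rbf_kernel(feats_a[name], feats_b[name])
        for name, w in weights.items()
    )
```

In this sketch each (image, mask) pair is represented by a dictionary of per-cue feature vectors. How those features are extracted is outside the scope of the example.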