investigating computational hardness through realistic ...· investigating computational hardness

Download Investigating Computational Hardness Through Realistic ...· Investigating Computational Hardness

Post on 14-Aug-2018

212 views

Category:

Documents

0 download

Embed Size (px)

TRANSCRIPT

  • InvestigatingComputationalHardnessThroughRealistic Boolean SAT DistributionsRealisticBooleanSATDistributions

    ThomasLiu&SoumyaKambhampatiPeggyPayneAcademyMcClintockHighSchool

  • Investigating Computational Hardness through Realistic Boolean SAT Distributions

    Abstract: A fundamental question in Computer Science is understanding when a specific class

    of problems go from being computationally easy to hard. Because of its generality and

    applications, the problem of Boolean Satisfiability (aka SAT) is often used as a vehicle for these

    studies. A signal result from these studies is that the hardness of SAT problems exhibits a

    dramatic easy-to-hard phase transition with respect to the problem constrainedness. Past studies

    have however focused mostly on SAT instances generated using uniform random distributions,

    where all constraints are independently generated, and the problem variables are all considered

    of equal importance. These assumptions are unfortunately not satisfied by most real problem.

    Our project aims for a deeper understanding of hardness of SAT problems that arise in practice.

    We study two key questions: (i) How does easy-to-hard transition change with more realistic

    distributions that capture neighborhood sensitivity and rich-get-richer aspects of real problems

    and (ii) Whether the results on random SAT distributions shed light on the hardness of specific

    problems such as Sudoku. Our results, based on extensive empirical studies with our own

    instrumented SAT solver, provide important insights into the practical implications of phase

    transition behavior on SAT problems.

  • Investigating Computational Hardness through Realistic Boolean SAT Distributions

    Executive Summary: In this project, we aimed to understand how real-world computational

    tasks go from being easy to being hard in terms of the time taken to solve them. We used the

    problem of Boolean Satisfiability (SAT) as the focus of our study because many other problems

    can be translated into SAT problems. SAT problems involve assigning values to a set of Boolean

    (True/False) decision variables while satisfying all the given constraints. They have applications

    ranging from scheduling software to decision making for robots.

    Prior research has shown that random SAT problems exhibit a dramatic easy-to-hard

    transition in problem hardness as their constrainedness (number of constraints relative to number

    of decision variables) is varied. The main focus past work has however been on uniform random

    distributions, in which clauses are generated independently of other clauses. Our project aimed

    for a deeper understanding of hardness of SAT problems that arise in practice. We studied two

    key questions: (i) How does easy-to-hard transition change with more realistic distributions that

    capture neighborhood sensitivity and rich-get-richer aspects of real problems and (ii) Whether

    the results on random SAT distributions shed light on the hardness of specific problems such as

    Sudoku. Our results, backed by extensive empirical studies with our own instrumented SAT

    solver, shed considerable light on these questions. With the power law, the centralization of the

    problem around a few key variables that occurred a lot made the problem less hard to solve but

    also less satisfiable. With the neighborhood distribution, localization of the problem to several

    smaller problems also made it easier to solve, but the unsatisfiability of one sub-problem caused

    the entire problem to fail, making it less satisfiable. With Sudoku, we found that harder Sudoku

    problems were indeed harder for the SAT solver, but this was not correlated to constrainedness,

    indicating that further research is needed on the origin of hardness in Sudoku.

  • 1

    Investigating Computational Hardness

    through Realistic Boolean SAT

    Distributions

    Introduction

    A fundamental question in Computer Science is understanding computational

    hardness of problems. Most work on computational hardness characterizes worst case

    complexity of problemsthat is, the amount of computational time taken to solve the

    problem in the worst case. Worst case complexity is however not always a good indicator

    of problem hardness. A class of problems may be hard in the worst case, and still be easy

    to solve in most practical cases. Consequently, researchers are interested in finding

    exactly where the problems go from being easy to hard.

    Work on this question has been done in the context of Boolean satisfiability (or

    SAT) problem [1-4]. Given a set of Boolean constraints on a set of propositions (aka

    variables), the SAT problem is to find a True/False assignment to the variables so that all

    the constraints are satisfied. Because SAT models the problem of choice making under

    constraints, SAT solvers have become an integral part in many computational tasks,

    ranging from scheduling software to decision making for robots. Thus, developing an

    understanding of the difficulty of SAT problems in practice is important.

  • 2

    SAT problems take exponential time (in the number of variables) to solve in the

    worst case, but are, however, easy to solve for most cases in practice. Research in the

    early 90s [1] showed that k-SAT problems (SAT problems with a fixed number of

    variables, k, in every clause) are easy to solve both when the constrainedness (the ratio

    of the number of clauses to the number of variables) is low and when it is high, abruptly

    transitioning from easy to hard in a very narrow region of constrainedness. Most of this

    phase transition studies were done on SAT instances that follow uniform random

    distribution. In such a distribution, variables take part in clauses with uniform probability,

    and clauses are independent (uncorrelated).

    The assumptions of uniform random distribution are not satisfied when we

    consider SAT instances that result from the compilation of real problems. Our project

    thus aims for a deeper understanding of the hardness of SAT problems that arise in

    practice. In particular, we study these two key questions:

    1. How does the phase transition behavior change with more realistic and natural

    distributions of SAT problems?

    2. Do SAT phase transitions allow us to predict hardness of other real world

    problems that can be compiled (translated) into SAT?

    In aid of the first, we considered SAT instances generated according to two non-uniform

    distributions: (i) power-law or rich-get-richer distribution and (ii) neighborhood-sensitive

    distributions. For the second question, we focused on SAT instances that come from

    Sudoku problems. We developed our own SAT solver and conducted a large-scale

    systematic empirical study on the phase transition behavior of these SAT instances. Our

  • 3

    hypothesis is that these studies will shed light on how easy-to-hard problem transition in

    real-world problems differs from uniform random distribution.

    Background

    A Boolean Satisfiability Problem (SAT) is the problem of determining if the

    variables in a Boolean formula can be set to True/False values so as to make the entire

    Boolean formula return true. If this is possible, the problem is called satisfiable. If not,

    the problem is unsatisfiable. For example, given a Boolean formula (P AND (P IMPLIES

    Q)), the assignment P=True, Q=True is a satisfying assignment. Given n variables, there

    are 2n

    possible assignments, and in the worst case, a solver has to check through each

    one of them. Indeed, SAT problems are known to be NP-Complete [1].

    CNF Clauses: The constraints in a general Boolean satisfiability problem can be any

    arbitrary propositional logic formula (involving negation, conjunction, disjunction and

    implication). CNF, or Conjunctive Normal Form, is a simpler (normal) form where each

    constraint (also called clause) is a disjunction of normal or negated variables (aka

    literals). Any Boolean formula can be easily converted to an equivalent set of CNF

    clauses [3]. For example, the constraint (P IMPLIES Q) can be converted to the clause

    (~P OR Q) (where ~stands for negation). The length of the clause is the number of

    variables in it.

    k-SAT: A general SAT problem, converted to CNF form, might normally result in

    clauses of varying lengths. A k-SAT problem instance is one where all the clauses are

    exactly length k. Interestingly, while 1-SAT and 2-SAT are both easy to solve (worst-

    case polynomial), from 3-SAT onwards, the worst case complexity becomes exponential.

  • 4

    Thus scientists often study 3-SAT problems as a stand-in for the general Boolean

    satisfiability problems [4].

    Phase Transition in SAT: While SAT problems are worst case exponential, it has long

    been known that most random SAT instances are easy to solve. This brings up the

    question: where are the hard SAT problems? Researchers have tried to answer this

    question by experimenting with randomly generated SAT instances [1]. They have found

    that SAT problems instances can be characterized by their constrainedness () which is

    the ratio of the number of clauses (constraints) to the number of variables. Hard

    problems, or problems that take SAT solvers the maximum effort to solve, occur in a

Recommended

View more >