I'm an Assosiate Professor at UESTC, a member of the Algorithms and Logic Group. My research interests are algorithms, combinatorial optimization and computational geometry. I'm also interested in more applied problems with nice theoretical components. Here is my CV.
I maintain a github account, a blog and a Mastodon @chao. I frequent cstheory.
Here is my academic family.
You can contact me through email [email protected].
COCOON 2024
A subset B of the ring \Z_n is referred to as a \ell-covering set if \{ ab \pmod n \mid 0\leq a \leq \ell, b\in B\} = \Z_n. We show that there exists a \ell-covering set of \Z_n of size O(\frac{n}{\ell}\log n) for all n and \ell, and how to construct such a set. We also provide examples where any \ell-covering set must have a size of \Omega(\frac{n}{\ell}\frac{\log n}{\log \log n}). The proof employs a refined bound for the relative totient function obtained through sieve theory and the existence of a large divisor with a linear divisor sum. The result can be used to simplify a modular subset sum algorithm.
IJTCS-FAW 2024
Nested ticketing represents a prevalent practice in airline bookings, employed to bypass certain airline ticketing regulations with the aim of reducing costs on multiple round-trip tickets. We consider the computational complexity of nested ticketing, and it is 'covert' version, which we call interleaved ticketing. Consider multiple agents living in different locations and a single client who demands that one agent be present each week. The objective for the company is to schedule these agents and arrange their flight itineraries to minimize the cost. We show that when nested ticketing is allowed, the problem is NP-hard when there are at least two agents. We also show there exists a polynomial-time algorithm if only interleaved ticketing is allowed, and the number of airlines is bounded.
IPCO 2024
Consider a matroid where all elements are labeled with an element in \mathbb{Z}. We are interested in finding a base where the sum of the labels is congruent to g \pmod m. We show that this problem can be solved in \tilde{O}(2^{4m} n r^{5/6}) time for a matroid with n elements and rank r, when m is either the product of two primes or a prime power. The algorithm can be generalized to all moduli and, in fact, to all abelian groups if a classic additive combinatorics conjecture by Schrijver and Seymour holds true. We also discuss the optimization version of the problem.
COCOON .
Let \mathbb{F} be any field, we consider solving Ax=b for a matrix A\in\mathbb{F}^{n\times n} of m non-zero elements and b\in\mathbb{F}^{n}. If we are given a zero forcing set of A of size k, we can solve the linear equation in O(mk+k^\omega) time, where \omega is the matrix multiplication exponent. As an application, we show how the lights out game in an n\times n grid is solved in O(n^3) time, and then improve the running time to O(n^\omega\log n) by exploiting the repeated structure in grids.
SODA .
In this paper, we study the minimum k-partition problem of submodular functions, i.e., given a finite set V and a submodular function f:2^V\to \R, computing a k-partition \{ V_1, \ldots, V_k \} of V with minimum \sum_{i=1}^k f(V_i). The problem is a natural generalization of the minimum k-cut problem in graphs and hypergraphs. It is known that the problem is NP-hard for general k, and solvable in polynomial time for k \leq 3. In this paper, we construct the first polynomial-time algorithm for the minimum 4-partition problem.
NeurIPS .
Decision trees are well-known due to their ease of interpretability. To improve accuracy, we need to grow deep trees or ensembles of trees. These are hard to interpret, offsetting their original benefits. Shapley values have recently become a popular way to explain the predictions of tree-based machine learning models. It provides a linear weighting to features independent of the tree structure. The rise in popularity is mainly due to TreeShap, which solves a general exponential complexity problem in polynomial time. Following extensive adoption in the industry, more efficient algorithms are required. This paper presents a more efficient and straightforward algorithm: Linear TreeShap. Like TreeShap, Linear TreeShap is exact and requires the same amount of memory.
FSTTCS .
Submodular functions are fundamental to combinatorial optimization. Many interesting problems can be formulated as special cases of problems involving submodular functions. In this work, we consider the problem of approximating symmetric submodular functions everywhere using hypergraph cut functions. Devanur, Dughmi, Schwartz, Sharma, and Singh showed that symmetric submodular functions over n-element ground sets cannot be approximated within (n/8)-factor using a graph cut function and raised the question of approximating them using hypergraph cut functions. Our main result is that there exist symmetric submodular functions over n-element ground sets that cannot be approximated within a o(n^{1/3}/\log^2 n)-factor using a hypergraph cut function. On the positive side, we show that symmetrized concave linear functions and symmetrized rank functions of uniform matroids and partition matroids can be constant-approximated using hypergraph cut functions.
MFCS .
The Traveling Tournament Problem (TTP) is a well-known benchmark problem in the field of tournament timetabling, which asks us to design a double round-robin schedule such that each pair of teams plays one game in each other's home venue, minimizing the total distance traveled by all n teams (n is even). TTP-k is the problem with one more constraint that each team can have at most k consecutive home games or away games. The case where k=3, TTP-3, is one of the most investigated cases. In this paper, we improve the approximation ratio of TTP-3 from (1.667+\epsilon) to (1.598+\epsilon), for any \epsilon>0. Previous schedules were constructed based on a Hamiltonian cycle of the graph. We propose a novel construction based on triangle packing. Then, by combining our triangle packing schedule with the Hamiltonian cycle schedule, we obtain the improved approximation ratio. The idea of our construction can also be extended to k\geq 4. We demonstrate that the approximation ratio of TTP-4 can be improved from (1.750+\epsilon) to (1.700+\epsilon) by the same method. As an additional product, we also improve the approximation ratio of LDTTP-3 (TTP-3 where all teams are allocated on a straight line) from 4/3 to (6/5+\epsilon).
Urban Complex Systems .
Street parking spots for automobiles are a scarce commodity in most urban environments. The heterogeneity of car sizes makes it inefficient to rigidly define fixed-sized spots. Instead, unmarked streets in cities like New York leave placement decisions to individual drivers, who have no direct incentive to maximize street utilization. In this paper, we explore the effectiveness of two different behavioral interventions designed to encourage better parking, namely (1) educational campaigns to encourage parkers to "kiss the bumper" and reduce the distance between themselves and their neighbors, or (2) painting appropriately-spaced markings on the street and urging drivers to "hit the line". Through analysis and simulation, we establish that the greatest densities are achieved when lines are painted to create spots roughly twice the length of average-sized cars. Kiss-the-bumper campaigns are in principle more effective than hit-the-line for equal degrees of compliance, although we believe that the visual cues of painted lines induce better parking behavior.
RANDOM .
We address counting and optimization variants of multicriteria global min-cut and size-constrained min-k-cut in hypergraphs.
SOSA .
Karger used spanning tree packings to derive a near linear-time randomized algorithm for the global minimum cut problem as well as a bound on the number of approximate minimum cuts. This is a different approach from his well-known random contraction algorithm. Thorup developed a fast deterministic algorithm for the minimum k-cut problem via greedy recursive tree packings. In this paper we revisit properties of an LP relaxation for k-cut proposed by Naor and Rabani, and analyzed by Chekuri, Guha and Naor. We show that the dual of the LP yields a tree packing, that when combined with an upper bound on the integrality gap for the LP, easily and transparently extends Karger's analysis for mincut to the k-cut problem. In addition to the simplicity of the algorithm and its analysis, this allows us to improve the running time of Thorup's algorithm by a factor of n. We also improve the bound on the number of \alpha-approximate k-cuts. Second, we give a simple proof that the integrality gap of the LP is 2(1-1/n). Third, we show that an optimum solution to the LP relaxation, for all values of k, is fully determined by the principal sequence of partitions of the input graph. This allows us to relate the LP relaxation to the Lagrangean relaxation approach of Barahona and Ravi and Sinha; it also shows that the idealized recursive tree packing considered by Thorup gives an optimum dual solution to the LP. This work arose from an effort to understand and simplify the results of Thorup.
SODA .
In the hypergraph k-cut problem, the input is a hypergraph, and the goal is to find a smallest subset of hyperedges whose removal ensures that the remaining hypergraph has at least k connected components. This problem is known to be at least as hard as the densest k-subgraph problem when k is part of the input (Chekuri-Li, 2015). We present a randomized polynomial time algorithm to solve the hypergraph k-cut problem for constant k. Our algorithm solves the more general hedge k-cut problem when the subgraph induced by every hedge has a constant number of connected components. In the hedge k-cut problem, the input is a hedgegraph specified by a vertex set and a disjoint set of hedges, where each hedge is a subset of edges defined over the vertices. The goal is to find a smallest subset of hedges whose removal ensures that the number of connected components in the remaining underlying (multi-)graph is at least k. Our algorithm is based on random contractions akin to Karger's min cut algorithm. Our main technical contribution is a distribution over the hedges (hyperedges) so that random contraction of hedges (hyperedges) chosen from the distribution succeeds in returning an optimum solution with large probability.
APPROX .
The computational complexity of multicut-like problems may vary significantly depending on whether the terminals are fixed or not. In this work we present a comprehensive study of this phenomenon in two types of cut problems in directed graphs: double cut and bicut.
Our techniques for the algorithms are combinatorial, based on LPs and based on enumeration of approximate min-cuts. Our hardness results are based on combinatorial reductions and integrality gap instances.
SODA .
Given a multiset S of n positive integers and a target integer t, the subset sum problem is to decide if there is a subset of S that sums up to t. We present a new divide-and-conquer algorithm that computes all the realizable subset sums up to an integer u in \tilde{O}\left(\min\{n\sqrt{u},u^{4/3},\sigma\}\right), where \sigma is the sum of all elements in S and \tilde{O} hides polylogarithmic factors. This result improves upon the standard dynamic programming algorithm that runs in O(nu) time. To the best of our knowledge, the new algorithm is the fastest general algorithm for this problem. We also present a modified algorithm for cyclic groups, which computes all the realizable subset sums within the group in \tilde{O}\left(\min\{n\sqrt{m},m^{5/4}\}\right) time, where m is the order of the group.
SODA .
We study algorithmic and structural aspects of connectivity in hypergraphs. Given a hypergraph H=(V,E) with n=|V|, m=|E| and p=\sum_{e\in E}|e| the best known algorithm to compute a global minimum cut in H runs in time O(np) for the uncapacitated case and in O(np+n^2\log n) time for the capacitated case. We show the following new results.
ESA .
The notion of element-connectivity has found several important applications in network design and routing problems. We focus on a reduction step that preserves the element-connectivity, which when applied repeatedly allows one to reduce the original graph to a simpler one. This pre-processing step is a crucial ingredient in several applications. In this paper we revisit this reduction step and provide a new proof via the use of setpairs. Our main contribution is algorithmic results for several basic problems on element-connectivity including the problem of achieving the aforementioned graph simplification. We utilize the underlying submodularity properties of element-connectivity to derive faster algorithms.
SODA .
A closed curve in the plane is weakly simple if it is the limit (in the Fréchet metric) of a sequence of simple closed curves. We describe an algorithm to determine whether a closed walk of length n in a simple plane graph is weakly simple in O(n \log n) time, improving an earlier O(n^3)-time algorithm of Cortese et al.. As an immediate corollary, we obtain the first efficient algorithm to determine whether an arbitrary n-vertex polygon is weakly simple; our algorithm runs in O(n^2 \log n) time. We also describe algorithms that detect weak simplicity in O(n \log n) time for two interesting classes of polygons. Finally, we discuss subtle errors in several previously published definitions of weak simplicity.
Mathematical Programming .
In this paper, we study the minimum k-partition problem of submodular functions, i.e., given a finite set V and a submodular function f:2^V\to \mathbb{R}, computing a k-partition \{V_1,\ldots,V_k\} of V with minimum \sum_{i=1}^k f(V_i). The problem is a natural generalization of the minimum k-cut problem in graphs and hypergraphs. It is known that the problem is NP-hard for general k, and solvable in polynomial time for fixed k\leq 3. In this paper, we construct the first polynomial-time algorithm for the minimum 4-partition problem.
Mathematical Programming .
We address counting and optimization variants of multicriteria global min-cut and size-constrained min-k-cut in hypergraphs.
European Journal of Operational Research .
We describe an algorithm for the high multiplicity asymmetric traveling salesman problem with feedback vertex set of size k (HMTSPFVS-kFVS) where each vertex can be visited a certain number of times and each cycle in a solution contains at least one vertex from the feedback vertex set. We show how it can be used to improve algorithms in automated storage and retrieval systems. An automated storage and retrieval system includes storage blocks and storage and retrieval machines that either move to retrieve unit loads from their current locations in the system to a depot or take unit loads from a depot and store them to specific locations in the system. Given n storage and retrieval requests in a system with k depots and one storage and retrieval machine, we show that our algorithm for HMTSPFVS-kFVS can solve the problem of minimizing total traveling time of the storage and retrieval machine in O(n^k+n^3) time when all depots are specialized (each depot fulfills one type of requests) and in O(n^{2k}+n^3) time when depots are regular (each depot fulfills both types of requests). The best previous algorithm only solves the special case of the problem with 2 regular depots in O(n^6) time. The applicability of our algorithm for several generalizations and special cases of the problem is also discussed. Furthermore, to evaluate the performance of our solution method, we perform extensive numerical experiments.
SIAM Journal on Discrete Mathematics .
The minimum violation problem asks for a vertex map from a digraph to a pattern digraph that minimizes violation, the total weight of the edges not mapped to an edge. We are interested in surjective mappings. We characterize all patterns where a minimum violation map that fixes some vertices can be computed in polynomial time. We also make progress in the case where we do not fix any vertex in the mapping, including when the digraph is disconnected, when the graph is in the variety of finite paths. Moreover, we obtain a dichotomy result for trees. We apply the result to some cut problems, including k-cut with size lower bounds and length bounded k-cuts.
SIAM Journal on Discrete Mathematics .
Karger used spanning tree packings [D. R. Karger, J. ACM, 47 (2000), pp. 46--76] to derive a near linear-time randomized algorithm for the global minimum cut problem as well as a bound on the number of approximate minimum cuts. This is a different approach from his well-known random contraction algorithm [D. R. Karger, Random Sampling in Graph Optimization Problems, Ph.D. thesis, Stanford University, Stanford, CA, 1995, D. R. Karger and C. Stein, J. ACM, 43 (1996), pp. 601--640]. Thorup developed a fast deterministic algorithm for the minimum k-cut problem via greedy recursive tree packings [M. Thorup, Minimum k-way cuts via deterministic greedy tree packing, in Proceedings of the Fortieth Annual ACM Symposium on Theory of Computing, ACM, 2008, pp. 159--166]. In this paper we revisit properties of an LP relaxation for cͅut proposed by Naor and Rabani [Tree packing and approximating k-cuts, in Proceedings of the Twelfth Annual ACM-SIAM Symposium on Discrete Algorithms, Vol. 103, SIAM, Philadelphia, 2001, pp. 26--27], and analyzed in [C. Chekuri, S. Guha, and J. Naor, SIAM J. Discrete Math., 20 (2006), pp. 261--271]. We show that the dual of the LP yields a tree packing that, when combined with an upper bound on the integrality gap for the LP, easily and transparently extends Karger's analysis for mincut to the k-cut problem. In addition to the simplicity of the algorithm and its analysis, this allows us to improve the running time of Thorup's algorithm by a factor of n. We also improve the bound on the number of \alpha-approximate k-cuts. Second, we give a simple proof that the integrality gap of the LP is 2(1-1/n). Third, we show that an optimum solution to the LP relaxation, for all values of k, is fully determined by the principal sequence of partitions of the input graph. This allows us to relate the LP relaxation to the Lagrangean relaxation approach of Barahona [Oper. Res. Lett., 26 (2000), pp. 99--105] and Ravi and Sinha [European J. Oper. Res., 186 (2008), pp. 77--90]; it also shows that the idealized recursive tree packing considered by Thorup gives an optimum dual solution to the LP.
Mathematical Programming .
For a fixed integer k\geq 2, the hypergraph k-cut problem asks for a smallest subset of hyperedges whose removal leads to at least k connected components in the remaining hypergraph. While graph k-cut is solvable efficiently (Goldschmidt and Hochbaum in Math. Oper. Res. 19(1):24–37, 1994), the complexity of hypergraph k-cut has been open. In this work, we present a randomized polynomial time algorithm to solve the hypergraph k-cut problem. Our algorithmic technique extends to solve the more general hedge k-cut problem when the subgraph induced by every hedge has a constant number of connected components. Our algorithm is based on random contractions akin to Karger’s min cut algorithm. Our main technical contribution is a non-uniform distribution over the hedges (hyperedges) so that random contraction of hedges (hyperedges) chosen from the distribution succeeds in returning an optimum solution with large probability. In addition, we present an alternative contraction based randomized polynomial time approximation scheme for hedge k-cut in arbitrary hedgegraphs (i.e., hedgegraphs whose hedges could have a large number of connected components). Our algorithm and analysis also lead to bounds on the number of optimal solutions to the respective problems.
ACM Transactions on Algorithms .
Given a (multi) set S of n positive integers and a target integer u, the subset sum problem is to decide if there is a subset of S that sums up to u. We present a series of new algorithms that compute and return
Mathematical Programming .
In the fixed-terminal bicut problem, the input is a directed graph with two specified nodes s and t and the goal is to find a smallest subset of edges whose removal ensures that s cannot reach t and t cannot reach s. In the global bicut problem, the input is a directed graph and the goal is to find a smallest subset of edges whose removal ensures that there exist two nodes s and t such that s cannot reach t and t cannot reach s. Fixed-terminal bicut and global bicut are natural extensions of \{s,t\}-min cut and global min-cut respectively, from undirected graphs to directed graphs. Fixed-terminal bicut is NP-hard, admits a simple 2-approximation, and does not admit a (2-\e)-approximation for any constant \e>0 assuming the unique games conjecture. In this work, we show that global bicut admits a (2-1/448)-approximation, thus improving on the approximability of the global variant in comparison to the fixed-terminal variant.
SIAM Journal on Computing .
We study algorithmic and structural aspects of connectivity in hypergraphs. Given a hypergraph H=(V,E) with n = |V|, m = |E| and p = \sum_{e \in E} |e| the fastest known algorithm to compute a global minimum cut in H runs in O(np) time for the uncapacitated case, and in O(np + n^2 \log n) time for the capacitated case. We show the following new results. problems in hypergraphs of small rank.
Our results build upon properties of vertex orderings that were inspired by the maximum adjacency ordering for graphs due to Nagamochi and Ibaraki. Unlike graphs we observe that there are several orderings for hypergraphs and these yield different insights.
Information Processing Letters .
We consider a problem in descriptive kinship systems, namely finding the shortest sequence of terms that describes the kinship between a person and his/her relatives. The problem reduces to finding the minimum weight path in a labeled graph where the label of the path comes from a regular language. The running time of the algorithm is O(n^3+s), where n and s are the input size and the output size of the algorithm, respectively.
Operations Research Letters .
For a simple undirected graph with n vertices and m edges, we consider a data structure that given a query of a pair of vertices u, v and an integer k\geq 1, it returns k edge-disjoint uv-paths. The data structure takes \tilde{O}(n^{3.375}) time to build, using O(mn^{1.5}\log n) space, and each query takes O(kn) time, which is optimal and beats the previous query time of O(kn\alpha(n)).
Congressus Numerantium .
In the game of Graph Nim, players take turns removing one or more edges incident to a chosen vertex in a graph. The player that removes the last edge in the graph wins. A spider graph is a champion if it has a Sprague-Grundy number equal to the number of edges in the graph. We investigate the the Sprague-Grundy numbers of various spider graphs when the number of paths or length of paths increase.
, Submitted.
2019, Submitted.
The capacitated vehicle routing problem (CVRP) is one of the most well known NP-hard combinatorial optimization problems. Single depot CVRP with a general metric is NP-hard even for fixed capacity 3, while polynomial time solvable for fixed capacity 2. We consider the variant of CVRP where restocking is available. We show that if there is a constant number of depots, then the problem can be solved in polynomial time when capacity is 2.
.
In this thesis, we consider cut and connectivity problems on graphs, digraphs, hypergraphs and hedgegraphs. The main results are the following: