Pattern in Labeled Ordered Rooted Trees

Problem1

Let $T$ be a rooted ordered labeled tree. Find all the vertices where all it’s subtrees are equal.

Let $T (v)$ to denote the subtree rooted at $v$ . The two trees are equal if they have the same shape and the same label.

1 Reduce to a string problem

This problem is interesting because one solution can demonstrate the technique of linearize the ordered tree to a string, and apply string algorithms.

First, we replace every edge in the tree with two directed edges $uv$ and $vu$ , where $u$ is closer to the root than $v$ . We label $uv$ with $($ and $vu$ with $)$ . This will be the new tree we work with.

Let $s (T)$ be a string defined by concatenating the labels on the path by traverse the tree with an euler tour by following the edges in a DFS like manner starting from the root of $T$ .

Note $s (T)$ would be a balanced set of parenthesis when $l$ maps vertices to the empty string. Indeed, there is a bijection between unlabeled ordered trees and balanced parenthesis. It’s not hard to see this generalizes to the labeled setting.

Theorem2

If $s (T) = s (T^{'})$ , then $T = T^{'}$ .

A run in a string is a maximal string of the form $a^{n} b$ , where $b$ is a prefix of $a$ and $n \geq 2$ . The runs theorem states there are at most $O (n)$ runs, and all of them can be found in $O (n)$ time[1]. Let $a^{n} b$ be a run in a string, then we call the $a^{n}$ part the complete repetitions.

Define the vertices with at least $2$ child and all it’s subtrees are equal as good vertices. It’s easy to $s (T (v))$ for some good vertex $v$ is going to be a complete repetition!

Now if we found a run, it’s easy to check if it actually correspond to a good vertex in $O (1)$ time once we did a $O (n)$ time preprocessing.

This allows us to solve the problem in $O (n)$ time.

Alternatively, there is a paper with a linear time algorithm to find all subtree repeats inside a tree[2].

2 Reference

References

[1]

M. Crochemore, L. Ilie, Computing longest previous factor in linear time and applications, Information Processing Letters. 106 (2008) 75–80.

[2]

T. Flouri, K. Kobert, SolonP. Pissis, A. Stamatakis, An optimal algorithm for computing all subtree repeats in trees, in: T. Lecroq, L. Mouchard (Eds.), Combinatorial Algorithms, Springer Berlin Heidelberg, 2013: pp. 269–282 10.1007/978-3-642-45278-9_23.

Posted by Chao Xu on 2014-06-05.

Tags: algorithm.