Pattern in Labeled Ordered Rooted Trees
Let be a rooted ordered labeled tree. Find all the vertices where all it’s subtrees are equal.
Let to denote the subtree rooted at . The two trees are equal if they have the same shape and the same label.
1 Reduce to a string problem
This problem is interesting because one solution can demonstrate the technique of linearize the ordered tree to a string, and apply string algorithms.
First, we replace every edge in the tree with two directed edges and , where is closer to the root than . We label with and with . This will be the new tree we work with.
Let be a string defined by concatenating the labels on the path by traverse the tree with an euler tour by following the edges in a DFS like manner starting from the root of .
Note would be a balanced set of parenthesis when maps vertices to the empty string. Indeed, there is a bijection between unlabeled ordered trees and balanced parenthesis. It’s not hard to see this generalizes to the labeled setting.
If , then .
A run in a string is a maximal string of the form , where is a prefix of and . The runs theorem states there are at most runs, and all of them can be found in time[1]. Let be a run in a string, then we call the part the complete repetitions.
Define the vertices with at least child and all it’s subtrees are equal as good vertices. It’s easy to for some good vertex is going to be a complete repetition!
Now if we found a run, it’s easy to check if it actually correspond to a good vertex in time once we did a time preprocessing.
This allows us to solve the problem in time.
Alternatively, there is a paper with a linear time algorithm to find all subtree repeats inside a tree[2].