Commit 36845d7

hitonanode and web-flow authored
Add enumerating Lyndon words (#115)
* Add lyndon enumeration
* Rename lyndon files
* [auto-verifier] verify commit f7d5554
* [auto-verifier] verify commit afea3a5

Co-authored-by: GitHub <[email protected]>
1 parent 65eba63 commit 36845d7

6 files changed: +116 -55 lines changed

.verify-helper/timestamps.remote.json

Lines changed: 2 additions & 2 deletions
@@ -169,8 +169,8 @@
 "string/test/rolling_hash.test.cpp": "2021-03-13 17:28:18 +0900",
 "string/test/rolling_hash_lcp.test.cpp": "2021-03-13 17:28:18 +0900",
 "string/test/rolling_hash_w_modint.test.cpp": "2021-06-06 14:54:00 +0900",
-"string/test/run_enumerate_lyndon_hash.test.cpp": "2021-03-14 17:31:33 +0900",
-"string/test/run_enumerate_lyndon_rmq.test.cpp": "2021-05-01 20:55:29 +0900",
+"string/test/run_enumerate_lyndon_hash.test.cpp": "2021-09-18 14:55:44 +0900",
+"string/test/run_enumerate_lyndon_rmq.test.cpp": "2021-09-18 14:55:44 +0900",
 "string/test/sa_count_keyword.reader.test.cpp": "2021-08-01 21:42:17 +0900",
 "string/test/sa_count_keyword.test.cpp": "2021-01-02 01:50:58 +0900",
 "string/test/suffix_array.test.cpp": "2021-01-02 00:51:41 +0900",

string/lyndon_factorization.hpp renamed to string/lyndon.hpp

Lines changed: 36 additions & 5 deletions
@@ -1,5 +1,7 @@
 #pragma once
 #include <algorithm>
+#include <cassert>
+#include <functional>
 #include <string>
 #include <tuple>
 #include <utility>
@@ -11,12 +13,13 @@
 // Reference:
 // [1] K. T. Chen, R. H. Fox, R. C. Lyndon,
 // "Free Differential Calculus, IV. The Quotient Groups of the Lower Central Series,"
-// Annals of Mathematics, 81-95, 1958.
+// Annals of Mathematics, 68(1), 81-95, 1958.
 // [2] J. P. Duval, "Factorizing words over an ordered alphabet,"
 // Journal of Algorithms, 4(4), 363-381, 1983.
 // - https://cp-algorithms.com/string/lyndon_factorization.html
 // - https://qiita.com/nakashi18/items/66882bd6e0127174267a
-template <typename T> std::vector<std::pair<int, int>> lyndon_factorization(const std::vector<T> &S) {
+template <typename T>
+std::vector<std::pair<int, int>> lyndon_factorization(const std::vector<T> &S) {
     const int N = S.size();
     std::vector<std::pair<int, int>> ret;
     for (int l = 0; l < N;) {
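
Note on the hunk above: the body of `lyndon_factorization` is not shown in this diff. For orientation only, here is a minimal standalone sketch of Duval's algorithm [2] for `std::string`, returning (start, length) pairs as the accompanying documentation describes. The name `duval_sketch` is hypothetical and the code is an illustration under that assumption, not the library's implementation.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of the commit): Lyndon factorization of s
// by Duval's algorithm, as (start, length) pairs, in O(N) time.
std::vector<std::pair<int, int>> duval_sketch(const std::string &s) {
    const int n = static_cast<int>(s.size());
    std::vector<std::pair<int, int>> factors;
    int l = 0;
    while (l < n) {
        // Invariant: s[l, j) is a power of a Lyndon word of length (j - i),
        // possibly followed by a proper prefix of that word.
        int i = l, j = l + 1;
        while (j < n && s[i] <= s[j]) {
            i = (s[i] == s[j]) ? i + 1 : l;
            ++j;
        }
        const int len = j - i;
        // Emit every full repetition of the Lyndon word found.
        while (l <= i) {
            factors.emplace_back(l, len);
            l += len;
        }
    }
    return factors;
}
```

For `"banana"` this yields the factors `b`, `an`, `an`, `a`, which are Lyndon words in non-increasing order.
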
@@ -41,7 +44,7 @@ std::vector<std::pair<int, int>> lyndon_factorization(const std::string &s) {
 // - `teletelepathy` -> [1,4,1,2,1,4,1,2,1,4,1,2,1]
 // Reference:
 // [1] H. Bannai et al., "The "Runs" Theorem,"
-// SIAM Journal on Computing, 46.5, 1501-1514, 2017.
+// SIAM Journal on Computing, 46(5), 1501-1514, 2017.
 template <typename String, typename LCPLENCallable>
 std::vector<int> longest_lyndon_prefixes(const String &s, const LCPLENCallable &lcp) {
     const int N = s.size();
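
Note on the hunk above: the `teletelepathy` example in the comment can be cross-checked with a brute-force reference. The sketch below is an assumption-laden illustration (the names `is_lyndon_naive` and `longest_lyndon_prefixes_naive` are hypothetical, and it is cubic-time, unlike the LCP-based function in the library); it only applies the definition of a Lyndon string directly.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: true iff s[l, r) is strictly smaller than each of its
// proper non-empty suffixes, i.e. s[l, r) is a Lyndon string.
bool is_lyndon_naive(const std::string &s, int l, int r) {
    for (int m = l + 1; m < r; ++m) {
        if (!std::lexicographical_compare(s.begin() + l, s.begin() + r,
                                          s.begin() + m, s.begin() + r)) {
            return false;
        }
    }
    return true;
}

// Hypothetical brute-force reference: for every i, the length of the longest
// Lyndon prefix of the suffix s[i, N). Intended for small inputs only.
std::vector<int> longest_lyndon_prefixes_naive(const std::string &s) {
    const int n = static_cast<int>(s.size());
    std::vector<int> len(n, 1);  // a single character is always Lyndon
    for (int i = 0; i < n; ++i) {
        for (int r = i + 1; r <= n; ++r) {
            if (is_lyndon_naive(s, i, r)) len[i] = r - i;
        }
    }
    return len;
}
```

On `teletelepathy` this reference reproduces the array quoted in the source comment.
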
@@ -67,8 +70,9 @@ std::vector<int> longest_lyndon_prefixes(const String &s, const LCPLENCallable &
 // N = 2e5 -> ~120 ms
 // Reference:
 // [1] H. Bannai et al., "The "Runs" Theorem,"
-// SIAM Journal on Computing, 46.5, 1501-1514, 2017.
-template <typename LCPLENCallable, typename String> std::vector<std::tuple<int, int, int>> run_enumerate(String s) {
+// SIAM Journal on Computing, 46(5), 1501-1514, 2017.
+template <typename LCPLENCallable, typename String>
+std::vector<std::tuple<int, int, int>> run_enumerate(String s) {
     if (s.empty()) return {};
     LCPLENCallable lcp(s);
     std::reverse(s.begin(), s.end());
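
Note on the hunk above: `run_enumerate` reports each run as `(c, l, r)`, where `s[l:r)` has minimal period `c` and covers at least two periods. A brute-force reference that follows this definition literally may help when testing small inputs. The sketch below is only an illustration: `run_enumerate_naive` is a hypothetical name, the running time is quadratic, and sorting the result by tuple is an assumption about a convenient output order, not a statement about the library.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical brute-force reference: all runs (c, l, r) of s.
std::vector<std::tuple<int, int, int>> run_enumerate_naive(const std::string &s) {
    const int n = static_cast<int>(s.size());
    std::set<std::pair<int, int>> seen;  // intervals already reported with a smaller period
    std::vector<std::tuple<int, int, int>> runs;
    for (int c = 1; 2 * c <= n; ++c) {
        int i = 0;
        while (i + c < n) {
            if (s[i] != s[i + c]) {
                ++i;
                continue;
            }
            // Maximal block [i, j) of positions with s[x] == s[x + c].
            int j = i;
            while (j + c < n && s[j] == s[j + c]) ++j;
            // s[i, j + c) has period c and cannot be extended on either side.
            // It spans two periods iff j - i >= c. Periods are tried in
            // increasing order, so an interval seen before has a smaller period.
            if (j - i >= c && seen.emplace(i, j + c).second) {
                runs.emplace_back(c, i, j + c);
            }
            i = j;
        }
    }
    std::sort(runs.begin(), runs.end());
    return runs;
}
```

For `mississippi` this finds the three squares `ss`, `ss`, `pp` with period 1 and the run `ississi` with period 3.
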
@@ -93,3 +97,30 @@ template <typename LCPLENCallable, typename String> std::vector<std::tuple<int,
     ret.erase(std::unique(ret.begin(), ret.end()), ret.end());
     return ret;
 }
+
+// Enumerate Lyndon words up to length n in lexical order
+// https://github.com/bqi343/USACO/blob/master/Implementations/content/combinatorial%20(11.2)/DeBruijnSeq.h
+// Example: k=2, n=4 => [[0,],[0,0,0,1,],[0,0,1,],[0,0,1,1,],[0,1,],[0,1,1,],[0,1,1,1,],[1,],]
+// Verified: https://codeforces.com/gym/102001/problem/C / https://codeforces.com/gym/100162/problem/G
+std::vector<std::vector<int>> enumerate_lyndon_words(int k, int n) {
+    assert(k > 0);
+    assert(n > 0);
+    std::vector<std::vector<int>> ret;
+    std::vector<int> aux(n + 1);
+
+    std::function<void(int, int)> gen = [&](int t, int p) {
+        // t: current length
+        // p: current min cycle length
+        if (t == n) {
+            std::vector<int> tmp(aux.begin() + 1, aux.begin() + p + 1);
+            ret.push_back(std::move(tmp));
+        } else {
+            ++t;
+            aux[t] = aux[t - p];
+            gen(t, p);
+            while (++aux[t] < k) gen(t, t);
+        }
+    };
+    gen(0, 1);
+    return ret;
+}
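
Note on the hunk above: the `k=2, n=4` example in the comment can be reproduced independently by filtering all sequences of length at most `n` with the definition of a Lyndon word. The sketch below is a hypothetical brute-force reference (`lyndon_words_naive` is not part of the commit) and is exponential in `n`, so it is only suitable for cross-checking small cases.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical brute-force reference: all Lyndon words over {0, ..., k-1}
// of length <= n, in lexicographic order.
std::vector<std::vector<int>> lyndon_words_naive(int k, int n) {
    assert(k > 0 && n > 0);
    // w is Lyndon iff it is strictly smaller than each proper non-empty suffix.
    auto is_lyndon = [](const std::vector<int> &w) {
        for (std::size_t m = 1; m < w.size(); ++m) {
            if (!std::lexicographical_compare(w.begin(), w.end(), w.begin() + m, w.end())) {
                return false;
            }
        }
        return true;
    };
    std::vector<std::vector<int>> words;
    std::vector<int> cur;
    // Pre-order DFS with children in increasing order visits sequences in
    // lexicographic order (a prefix comes before all of its extensions).
    std::function<void()> dfs = [&]() {
        if (!cur.empty() && is_lyndon(cur)) words.push_back(cur);
        if (static_cast<int>(cur.size()) == n) return;
        for (int c = 0; c < k; ++c) {
            cur.push_back(c);
            dfs();
            cur.pop_back();
        }
    };
    dfs();
    return words;
}
```

With `k = 2, n = 4` this produces the same eight words, in the same order, as the example in the source comment.
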

string/lyndon.md

Lines changed: 76 additions & 0 deletions
@@ -0,0 +1,76 @@
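+---
+title: Lyndon words (functions related to Lyndon strings)
+documentation_of: ./lyndon.hpp
+---
+
+Functions that compute the Lyndon factorization of a sequence of comparable elements (strings, integer sequences, and so on), enumerate Lyndon words, and more.
+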
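+## Lyndon strings
+
+- $S$ is a Lyndon string if $S$ itself is the lexicographically smallest of all non-empty suffixes of $S$, that is, $S$ is strictly smaller than every proper suffix.
+
+## Lyndon factorization
+
+- (Definition) The Lyndon factorization of a string $S$ is a decomposition $S = w_1 w_2 \dots w_M$ in which every $w_i$ is a Lyndon string and the $w_i$ are lexicographically non-increasing.
+- (Uniqueness) The Lyndon factorization is unique.
+- (Construction) $w_1$ is the longest prefix of $S$ that is a Lyndon string.
+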
+## Implemented functions
+
+### lyndon_factorization
+
+```cpp
+vector<pair<int, int>> ret = lyndon_factorization(s);
+```
+
+- Returns the list of (start position, length) pairs of the Lyndon factorization of `s`, a sequence of comparable elements such as a string.
+- Time complexity: $O(N)$
+
+### longest_lyndon_prefixes
+
+``` cpp
+vector<int> ret = longest_lyndon_prefixes(s, LCPLEN_Callable_obj);
+```
+
+- Returns an array holding, for each suffix `s[i:N)`, the length `len(i)` of its longest Lyndon prefix `s[i:i+len(i))`.
+- Time complexity: $O(NL)$
+  - $L$ is the cost of one longest-common-prefix length query between `s[i:)` and `s[j:)`.
+
+### run_enumerate
+
+``` cpp
+ret = run_enumerate<LCPLEN_Callable>(s);
+```
+
+- Enumerates every run as a tuple `(c, l, r)`, meaning that `s[l:r)` has minimal period `c` and spans at least two periods.
+- Time complexity: $O(NL)$
+
+### enumerate_lyndon_words
+
+```cpp
+int k, n;
+vector<vector<int>> seqs = enumerate_lyndon_words(k, n);
+```
+
+- Enumerates, in lexicographic order, all Lyndon sequences of length at most $n$ whose elements are in $0, \dots, (k - 1)$.
+- As an application, the "[de Bruijn sequence - Wikipedia](https://en.wikipedia.org/wiki/De_Bruijn_sequence)" $B(k, n)$ can be constructed.
+  - A de Bruijn sequence $B(k, n)$ is a sequence of length $k^n$ of integers between $0$ and $k - 1$ whose $k^n$ contiguous subsequences of length $n$ (wrapping around cyclically at the ends) are pairwise distinct.
+  - The lexicographically smallest $B(k, n)$ is known to be obtained by concatenating, in lexicographic order, all Lyndon sequences with elements between $0$ and $k - 1$ whose length is a divisor of $n$.
+
+## Example problems
+
+- [Run Enumerate - Library Checker](https://judge.yosupo.jp/problem/runenumerate)
+- [2012-2013 Petrozavodsk Winter Training Camp, Saratov SU Contest G. Lyndon Words - Codeforces](https://codeforces.com/gym/100162/problem/G) Enumerate Lyndon words.
+- [2018 ICPC Asia Jakarta Regional Contest C. Smart Thief - Codeforces](https://codeforces.com/gym/102001/problem/C) Construct a de Bruijn sequence.
+
+## References and links
+
+- [1] K. T. Chen, R. H. Fox, R. C. Lyndon,
+  "Free Differential Calculus, IV. The Quotient Groups of the Lower Central Series,"
+  Annals of Mathematics, 68(1), 81-95, 1958.
+- [2] J. P. Duval, "Factorizing words over an ordered alphabet,"
+  Journal of Algorithms, 4(4), 363-381, 1983.
+- [3] H. Bannai et al., "The "Runs" Theorem,"
+  SIAM Journal on Computing, 46(5), 1501-1514, 2017.
+- [Lyndon factorization - Competitive Programming Algorithms](https://cp-algorithms.com/string/lyndon_factorization.html)
+- [Lyndon 文字列入門 - Qiita](https://qiita.com/nakashi18/items/66882bd6e0127174267a)
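
Note on `string/lyndon.md` above: the `enumerate_lyndon_words` section states that the lexicographically smallest de Bruijn sequence $B(k, n)$ is the concatenation, in lexicographic order, of all Lyndon words whose length divides $n$. A minimal sketch of that construction follows. It reuses the same recursion shape as the function added to `lyndon.hpp` in this commit but keeps only words with `n % p == 0`. The names `de_bruijn_sketch` and `all_cyclic_windows_distinct` are hypothetical, and this is an illustration under those assumptions rather than code from the commit.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <vector>

// Hypothetical helper: concatenates, in lexicographic order, the Lyndon words
// over {0, ..., k-1} whose length divides n.
std::vector<int> de_bruijn_sketch(int k, int n) {
    assert(k > 0 && n > 0);
    std::vector<int> seq;
    std::vector<int> aux(n + 1, 0);
    std::function<void(int, int)> gen = [&](int t, int p) {
        // t: current length, p: current minimal period of aux[1..t]
        if (t == n) {
            if (n % p == 0) seq.insert(seq.end(), aux.begin() + 1, aux.begin() + p + 1);
        } else {
            ++t;
            aux[t] = aux[t - p];
            gen(t, p);
            while (++aux[t] < k) gen(t, t);
        }
    };
    gen(0, 1);
    return seq;
}

// Hypothetical checker: true iff all cyclic windows of length n are distinct.
bool all_cyclic_windows_distinct(const std::vector<int> &seq, int n) {
    const int len = static_cast<int>(seq.size());
    std::set<std::vector<int>> windows;
    for (int i = 0; i < len; ++i) {
        std::vector<int> w(n);
        for (int j = 0; j < n; ++j) w[j] = seq[(i + j) % len];
        if (!windows.insert(w).second) return false;
    }
    return true;
}
```

For `k = 2, n = 3` the Lyndon words of length 1 or 3 are `0`, `001`, `011`, `1`, so the sketch returns `00010111`, whose eight cyclic windows of length 3 are all distinct.
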

string/lyndon_factorization.md

Lines changed: 0 additions & 46 deletions
This file was deleted.

string/test/run_enumerate_lyndon_hash.test.cpp

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-#include "../lyndon_factorization.hpp"
+#include "../lyndon.hpp"
 #include "../rolling_hash_1d.hpp"
 #include <iostream>
 #include <string>

string/test/run_enumerate_lyndon_rmq.test.cpp

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 #include "../longest_common_prefix.hpp"
-#include "../lyndon_factorization.hpp"
+#include "../lyndon.hpp"
 #include <iostream>
 #include <string>
 #define PROBLEM "https://judge.yosupo.jp/problem/runenumerate"
