std::regex_token_iterator

Defined in header `<regex>`
template< class BidirIt, class CharT = typename std::iterator_traits<BidirIt>::value_type, class Traits = std::regex_traits<CharT> > class regex_token_iterator		(since C++11)

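std::regex_token_iterator is a read-only LegacyForwardIterator that accesses the individual sub-matches of every match of a regular expression within the underlying character sequence. It can also be used to access the parts of the sequence that were not matched by the given regular expression (e.g. as a tokenizer).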

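On construction, it constructs a std::regex_iterator. On every increment, it steps through the requested sub-matches of the current `match_results`, and it increments the underlying std::regex_iterator when incrementing away from the last requested sub-match.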

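The default-constructed std::regex_token_iterator is the end-of-sequence iterator. When a valid std::regex_token_iterator is incremented after reaching the last sub-match of the last match, it becomes equal to the end-of-sequence iterator. Dereferencing or incrementing it further invokes undefined behavior.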

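Just before becoming the end-of-sequence iterator, a std::regex_token_iterator may become a *suffix iterator* if the index -1 (non-matched fragment) appears in the list of requested sub-match indices. Such an iterator, when dereferenced, returns a std::sub_match corresponding to the sequence of characters between the last match and the end of the sequence.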

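A typical implementation of std::regex_token_iterator holds the underlying std::regex_iterator, a container (e.g. std::vector<int>) of the requested sub-match indices, an internal counter equal to the index of the sub-match, a pointer to std::sub_match pointing at the current sub-match of the current match, and a std::match_results object containing the last non-matched character sequence (used in tokenizer mode).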

Specializations

Defined in header `<regex>`
Type	Definition
`std::cregex_token_iterator`	std::regex_token_iterator<const char*>
`std::wcregex_token_iterator`	std::regex_token_iterator<const wchar_t*>
`std::sregex_token_iterator`	std::regex_token_iterator<std::string::const_iterator>
`std::wsregex_token_iterator`	std::regex_token_iterator<std::wstring::const_iterator>

Member types

Member type	Definition
`value_type`	std::sub_match<BidirIt>
`difference_type`	std::ptrdiff_t
`pointer`	const value_type*
`reference`	const value_type&
`iterator_category`	std::forward_iterator_tag
`iterator_concept` (C++20)	std::input_iterator_tag
`regex_type`	std::basic_regex<CharT, Traits>

Member functions

(constructor)	constructs a new `regex_token_iterator` (public member function)
(destructor) (implicitly declared)	destructs a `regex_token_iterator`, including the cached value (public member function)
operator=	assigns contents (public member function)
operator==, operator!= (the latter removed in C++20)	compares two `regex_token_iterator`s (public member function)
operator*, operator->	accesses the current sub-match (public member function)
operator++, operator++(int)	advances the iterator to the next sub-match (public member function)

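Notes

It is the programmer's responsibility to ensure that the std::basic_regex object passed to the iterator's constructor outlives the iterator. Because the iterator stores a std::regex_iterator, which in turn stores a pointer to the regex, incrementing the iterator after the regex has been destroyed results in undefined behavior.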

Example

#include <algorithm>
#include <iostream>
#include <iterator>
#include <regex>
#include <string>
 
int main()
{
    // Tokenization (non-matched fragments)
    // Note that regex is matched only two times; when the third value is obtained
    // the iterator is a suffix iterator.
    const std::string text = "Quick brown fox.";
    const std::regex ws_re("\\s+"); // whitespace
    std::copy(std::sregex_token_iterator(text.begin(), text.end(), ws_re, -1),
              std::sregex_token_iterator(),
              std::ostream_iterator<std::string>(std::cout, "\n"));
 
    std::cout << '\n';
 
    // Iterating the first submatches
    const std::string html = R"(<p><a href="http://google.com">google</a> )"
                             R"(< a HREF ="http://ja.cppreference.dev">cppreference</a>\n</p>)";
    const std::regex url_re(R"!!(<\s*A\s+[^>]*href\s*=\s*"([^"]*)")!!", std::regex::icase);
    std::copy(std::sregex_token_iterator(html.begin(), html.end(), url_re, 1),
              std::sregex_token_iterator(),
              std::ostream_iterator<std::string>(std::cout, "\n"));
}

Output:

Quick
brown
fox.
 
http://google.com
http://ja.cppreference.dev

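Defect reports

The following behavior-changing defect reports were applied retroactively to previously published C++ standards.

DR	Applied to	Behavior as published	Correct behavior
LWG 3698 (P2770R0)	C++20	`regex_token_iterator` was a `forward_iterator` while being a stashing iterator	made `input_iterator`[1]

[1] iterator_category was left unchanged by the resolution, because changing it to std::input_iterator_tag might break too much existing code.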

