CWB
Macros | Functions | Variables
regopt.c File Reference

The CL_Regex object, and the CL Regular Expression Optimiser. More...

#include <glib.h>
#include "globals.h"
#include "regopt.h"

Macros

#define PCRE_STUDY_JIT_COMPILE   0
 

Functions

int cl_regopt_analyse (char *regex)
 Analyses a regular expression and tries to find the best set of grains. More...
 
CL_Regex cl_new_regex (char *regex, int flags, CorpusCharset charset)
 Create a new CL_regex object (ie a regular expression buffer). More...
 
int cl_regex_optimised (CL_Regex rx)
 Finds the level of optimisation of a CL_Regex. More...
 
int cl_regex_match (CL_Regex rx, char *str, int normalize_utf8)
 Matches a regular expression against a string. More...
 
void cl_delete_regex (CL_Regex rx)
 Deletes a CL_Regex object, and frees all resources associated with the pre-compiled regex. More...
 
int is_safe_char (unsigned char c)
 Is the given character a 'safe' character which will only match itself in a regex? More...
 
int is_ascii_alnum (unsigned char c)
 Is the given character an ASCII alphanumeric? More...
 
int is_ascii_punct (unsigned char c)
 Is the given character ASCII punctuation? More...
 
int is_hexadecimal (unsigned char c)
 Is the given character a valid hexadecimal digit? More...
 
char * read_escape_seq (char *mark)
 Read in an escape sequence for a character or class - part of the CL Regex Optimiser. More...
 
char * read_matchall (char *mark)
 Reads in an element matching some character - part of the CL Regex Optimiser. More...
 
char * read_kleene (char *mark, int *one_or_more)
 Reads in a repetition operator - part of the CL Regex Optimiser. More...
 
char * read_wildcard (char *mark)
 Reads in a wildcard - part of the CL Regex Optimiser. More...
 
char * read_grain (char *mark, char *grain, int *len)
 Reads in a literal grain from a regex - part of the CL Regex Optimiser. More...
 
char * read_disjunction (char *mark, int *align_start, int *align_end, int no_paren)
 Finds grains in a simple disjunction group - part of the CL Regex Optimiser. More...
 
void update_grain_buffer (int at_start, int at_end)
 Updates the public grain buffer – part of the CL Regex Optimiser. More...
 
void make_jump_table (CL_Regex rx)
 Computes a jump table for Boyer-Moore searches – part of the CL Regex Optimiser. More...
 
void regopt_data_copy_to_regex_object (CL_Regex rx)
 Copy grain set from internal global variables to CL_Regex object – part of the CL Regex Optimiser. More...
 
void cl_regopt_count_reset (void)
 Reset the "success counter" for optimised regexes. More...
 
int cl_regopt_count_get (void)
 Get a reading from the "success counter" for optimised regexes. More...
 

Variables

char * cl_regopt_grain [MAX_GRAINS]
 list of 'grains' (any matching string must contain one of these) More...
 
int cl_regopt_grain_len
 length of shortest grain (in characters) More...
 
int cl_regopt_grains
 number of grains More...
 
int cl_regopt_anchor_start
 Boolean: whether grains are anchored at beginning of string. More...
 
int cl_regopt_anchor_end
 Boolean: whether grains are anchored at end of string. More...
 
int cl_regopt_utf8
 whether the regular expression is in UTF-8 encoding (set before calling cl_regopt_analyse()) More...
 
char * grain_buffer [MAX_GRAINS]
 Intermediate buffer for grains. More...
 
int grain_buffer_len [MAX_GRAINS]
 the length of each grain (in characters) More...
 
int grain_buffer_grains = 0
 The number of grains currently in the intermediate buffer. More...
 
char public_grain_data [CL_MAX_LINE_LENGTH]
 A buffer for grain strings. More...
 
char local_grain_data [CL_MAX_LINE_LENGTH]
 A buffer for grain strings. More...
 
int cl_regopt_successes = 0
 A counter of how many times the "grain" system has allwoed us to avoid calling the regex engine. More...
 
char cl_regex_error [CL_MAX_LINE_LENGTH]
 The error message from (PCRE) regex compilation are placed in this buffer if cl_new_regex() fails. More...
 

Detailed Description

The CL_Regex object, and the CL Regular Expression Optimiser.

This is the CL front-end to POSIX regular expressions with CL semantics (most notably: CL regexes always match the entire string and NOT substrings.)

Note that the optimiser is handled automatically by the CL_Regex object.

All variables / functions containing "regopt" are internal to this module and are not exported in the CL API.

Optimisation is done by means of "grains". The grain array in a CL_Regex object is a list of short strings. Any string which will match the regex must contain at least one of these. Thus, the grains provide a quick way of filtering out strings that definitely WON'T match, and avoiding a time-wasting call to the POSIX regex matching function.

While a regex is being optimised, the grains are stored in non-exported global variables in this module. Subsequently they are transferred to members of the CL_regex object with which they are associated. The use of global variables and a fixed-size buffer for grains is partly due to historical reasons, but it does also serve to reduce memory allocation overhead.

Macro Definition Documentation

#define PCRE_STUDY_JIT_COMPILE   0

Referenced by cl_new_regex().

Function Documentation

void cl_delete_regex ( CL_Regex  rx)

Deletes a CL_Regex object, and frees all resources associated with the pre-compiled regex.

Note that we use cl_free to deallocate the internal PCRE buffers, not pcre_free, for the simple reason that pcre_free is just a function pointer that will normally contain free, and thus we miss out on the checking that cl_free provides.

Parameters
rxThe CL_Regex to delete.

References cl_free, _CL_Regex::extra, _CL_Regex::grain, _CL_Regex::grains, _CL_Regex::haystack_buf, _CL_Regex::haystack_casefold, and _CL_Regex::needle.

Referenced by cl_regex2id(), free_booltree(), free_environment(), and main().

CL_Regex cl_new_regex ( char *  regex,
int  flags,
CorpusCharset  charset 
)

Create a new CL_regex object (ie a regular expression buffer).

This function compiles the regular expression according to the specified flags (IGNORE_CASE and/or IGNORE_DIAC and/or REQUIRE_NFC) and for the specified character encoding. The regex is automatically anchored to the start and end of the string (i.e. wrapped in ^(?:...)$).

The regular expression engine used is PCRE. However, the regex is optimized by scanning it for literal strings ("grains") that must be contained in any match; the grains can be used as a fast pre-filter (using Boyer-Moore search for the grains).

The optimizer only understands a subset of PCRE syntax:

  • literal characters (alphanumeric, safe punctuation, escaped punctuation)
  • numeric character codes ( and )
  • escape sequences for character classes and Unicode properties
  • all repetition operators
  • simple alternatives (...|...|...)
  • nested capturing (...) and non-capturing (?:...) groups Any regexp that contains other syntactic elements such as
  • character sets [...]
  • named groups, look-ahead and look-behind patterns, etc.
  • backreferences
  • modifiers such as (?i) cannot be parsed and optimized. Note that even if a regexp is parsed by the optimizer, it might not be able to extract all grains (because grain recognition uses an even more restrictive syntax).

The optimizer is always disabled with IGNORE_DIAC if either PCRE JIT is available or the charset is UTF-8. Testing has showed that in these cases the overhead from case-folding each input string outweighs the benefits of the optimizer.

Parameters
regexString containing the regular expression
flagsIGNORE_CASE, or IGNORE_DIAC, or both, or 0.
charsetThe character set of the regex.
Returns
The new CL_Regex object, or NULL in case of error.

References CDA_EBADREGEX, CDA_OK, _CL_Regex::charset, charset, cl_debug, cl_errno, cl_free, cl_malloc(), CL_MAX_LINE_LENGTH, cl_regex_error, cl_regopt_analyse(), cl_regopt_grain, cl_regopt_grains, cl_regopt_utf8, cl_string_canonical(), cl_string_latex2iso(), _CL_Regex::extra, _CL_Regex::grains, _CL_Regex::haystack_buf, _CL_Regex::haystack_casefold, _CL_Regex::icase, _CL_Regex::idiac, IGNORE_CASE, IGNORE_DIAC, _CL_Regex::needle, PCRE_STUDY_JIT_COMPILE, regopt_data_copy_to_regex_object(), REQUIRE_NFC, and utf8.

Referenced by cl_regex2id(), do_flagged_string(), do_XMLTag(), main(), and scancorpus_add_key().

int cl_regex_match ( CL_Regex  rx,
char *  str,
int  normalize_utf8 
)

Matches a regular expression against a string.

The pre-compiled regular expression contained in the CL_Regex is compared to the string. This regex automatically uses the case/accent folding flags and character encoding that were specified when the CL_Regex constructor was called.

If the subject string is a UTF-8 string from an external sources, the caller can request enforcement of the subject to canonical NFC form by setting the third argument to true.

See also
cl_new_regex
Parameters
rxThe regular expression to match.
strThe subject (the string to compare the regex to).
normalize_utf8If a UTF-8 string from an external source is passed as the subject, set to this parameter to true, and the function will make sure that the comparison is based on the canonical NFC form. For known-NFC strings, this parameter should be false. If the regex is not UTF-8, this parameter is ignored.
Returns
Boolean: true if the regex matched, otherwise false.

References _CL_Regex::anchor_end, _CL_Regex::anchor_start, _CL_Regex::charset, cl_debug, cl_optimize, cl_regopt_successes, cl_string_canonical(), _CL_Regex::extra, _CL_Regex::grain, _CL_Regex::grain_len, _CL_Regex::grains, _CL_Regex::haystack_buf, _CL_Regex::haystack_casefold, _CL_Regex::icase, _CL_Regex::idiac, _CL_Regex::jumptable, _CL_Regex::needle, REQUIRE_NFC, and utf8.

Referenced by cl_regex2id(), eval_bool(), eval_constraint(), main(), matchfirstpattern(), and scancorpus_word_is_regular().

int cl_regex_optimised ( CL_Regex  rx)

Finds the level of optimisation of a CL_Regex.

This function returns the approximate level of optimisation, computed from the ratio of grain length to number of grains (0 = no grains, ergo not optimised at all).

Parameters
rxThe CL_Regex to check.
Returns
0 if rx is not optimised; otherwise an integer indicating optimisation level.

References _CL_Regex::grain_len, and _CL_Regex::grains.

Referenced by cl_regex2id().

int cl_regopt_analyse ( char *  regex)

Analyses a regular expression and tries to find the best set of grains.

Part of the regex optimiser. For a given regular expression, this function will try to extract a set of grains from regular expression {regex_string}. These grains are then used by the CL regex matcher and cl_regex2id() as a pre-filter for faster regular expression search.

The function only recognizes relatively simple regular expressions without recursive nesting, which roughly correspond to wildcard searches similar to SQL's LIKE operator. Searching for a fixed prefix, suffix or infix (or a small set of alternatives) will see the most noticeable performance improvement. Any more complex expression is passed directly to the standard PCRE evaluation. See cl_new_regex() for an overview of the supported subsets of PCRE syntax.

If successful, this function returns True and stores the grains in the optiomiser's global variables above, from which they then must be copied to a CL_Regex object with regopt_data_copy_to_regex_object(). This step will also casefold grains if necessary and prepare the Boyer-Moore jump tables.

Usage: optimised = cl_regopt_analyse(regex_string);

This is a non-exported function.

Parameters
regexString containing the regex to optimise.
Returns
Boolean: true = ok, false = couldn't optimise regex.

The code below parses a regular expression using the supported PCRE subset and attempts to extract a literal string ("grain") that must occur in every match of the regular expression. In a disjunction group, each alternative must contain a grain in order to be used for optimized matching; we refer to this as a complete grain set.

The algorithm scans the regular expression from left to right, looking for the following supported elements in the specified order:

  1. a literal grain (read_grain)
  2. a parenthesized disjunction containing grains (read_disjunction)
  3. a segment matching some, possibly empty substring (read_wildcard), including recursively nested capturing or non-capturing groups

See the respective functions for further information on these elements. Grains are collected by read_grain and read_disjunction in a local buffer. If a complete grain set has been found, it replaces the previous global grain set if it is considered to be more effective based on the length and number of grains.

As a special case, the optimizer first attempts to match the entire regexp as a simple disjunction (read_disjunction) without parentheses.

References cl_debug, cl_regopt_anchor_end, cl_regopt_anchor_start, cl_regopt_grain_len, cl_regopt_grains, grain_buffer, grain_buffer_grains, grain_buffer_len, local_grain_data, read_disjunction(), read_grain(), read_kleene(), read_wildcard(), and update_grain_buffer().

Referenced by cl_new_regex().

int cl_regopt_count_get ( void  )

Get a reading from the "success counter" for optimised regexes.

The counter is incremented by 1 every time the "grain" system is used successfully to avoid calling PCRE. That is, it is incremented every time a string is scrutinised and found to contain none of the grains.

Usage:

cl_regopt_count_reset();

for (i = 0, hits = 0; i < n; i++) if (cl_regex_match(rx, haystacks[i])) hits++;

fprintf(stderr, "Found %d matches; avoided regex matching %d times out of %d trials", hits, cl_regopt_count_get(), n );

See also
cl_regopt_count_reset
Returns
an integer indicating the number of times a regular expression has been matched using the regopt system of "grains", rather than by calling an external regex library.

References cl_regopt_successes.

Referenced by cl_regex2id().

void cl_regopt_count_reset ( void  )

Reset the "success counter" for optimised regexes.

References cl_regopt_successes.

Referenced by cl_regex2id().

int is_ascii_alnum ( unsigned char  c)

Is the given character an ASCII alphanumeric?

ASCII alphanumeric characters comprise A-Z, a-z and 0-9; they are the only characters that form special escape sequences in PCRE regular expressions.

Parameters
cThe character (cast to unsigned for the comparison).
Returns
True if ASCII alphanumeric; false otherwise.

Referenced by read_escape_seq().

int is_ascii_punct ( unsigned char  c)

Is the given character ASCII punctuation?

ASCII punctuation symbols are the only characters that may need to be protected by a \ in regular expressions. They cannot form special escape sequences.

Parameters
cThe character (cast to unsigned for the comparison).
Returns
True if ASCII alphanumeric; false otherwise.

Referenced by read_grain().

int is_hexadecimal ( unsigned char  c)

Is the given character a valid hexadecimal digit?

Parameters
cThe character (cast to unsigned for the comparison).
Returns
True if valid hexidecimal digit; false otherwise.

Referenced by read_escape_seq().

int is_safe_char ( unsigned char  c)

Is the given character a 'safe' character which will only match itself in a regex?

What counts as safe: A to Z, a to z, 0 to 9, minus, quote marks, percent, ampersand, slash, excl mark, colon, semi colon, character, underscore, tilde, any values above 0x7f (ISO 8859 extension or UTF-8 non-ASCII character).

What counts as not safe therefore includes: brackets, braces, square brackets; questionmark, plus, and star; circumflex and dollar sign; dot; hash; backslash, etc. (But, in UTF8, Unicode PUNC area equivalents of these characters will be safe.)

A safe character can never be the start of a meta element (even if it might appear as part of one), so it's safe to include in a literal grain.

Parameters
cThe character (cast to unsigned for the comparison).
Returns
True for non-special characters; false for special characters.

Referenced by read_grain(), and read_matchall().

void make_jump_table ( CL_Regex  rx)

Computes a jump table for Boyer-Moore searches – part of the CL Regex Optimiser.

Unlike the textbook version, this jumptable includes the last character of each grain (so we don't have to start the string comparison loops every time).

A non-exported function.

References cl_debug, _CL_Regex::grain, _CL_Regex::grain_len, _CL_Regex::grains, and _CL_Regex::jumptable.

Referenced by regopt_data_copy_to_regex_object().

char* read_disjunction ( char *  mark,
int *  align_start,
int *  align_end,
int  no_paren 
)

Finds grains in a simple disjunction group - part of the CL Regex Optimiser.

This function parses a simple parenthesized disjunction within a regular expression and attempts to extract one grain from each alternative. Grains are written to the local grain buffer. If a complete grain set has been found, the functions returns a pointer to the first character after the disjunction. Otherwise it returns mark and the caller can try to accept the group with read_matchall.

For simplicity, only the first grain in each alternative is considered. This makes it easier to check start/end alignment of the grains. Note that the local grain buffer is always mangled, even if the function is unsuccessful.

The first argument, mark, must point to the '(' at the beginning of the disjunction group (unless no_paren is set).

The booleans align_start and align_end are set to true if the grains from all alternatives are anchored at the start or end of the disjunction group, respectively.

This is a non-exported function.

Parameters
markPointer to the disjunction group (see also function description).
align_startSee function description.
align_endSee function description.
no_parenAttempt to read a top-level disjunction without parentheses, which must extend to the end of the string.
Returns
A pointer to first character after the disjunction group iff the parse succeeded, the original pointer in the mark argument otherwise.

References buf, grain_buffer, grain_buffer_grains, grain_buffer_len, local_grain_data, MAX_GRAINS, read_grain(), and read_wildcard().

Referenced by cl_regopt_analyse().

char* read_escape_seq ( char *  mark)

Read in an escape sequence for a character or class - part of the CL Regex Optimiser.

This function reads one of the following escape sequences: ##, {###} ... hexadecimal character code {###} ... octal character code , , , , , ... generic character types #, {###} ... Unicode properties #, {###} ... negated Unicode properties ... Unicode extended grapheme cluster

References is_ascii_alnum(), and is_hexadecimal().

Referenced by read_matchall().

char* read_grain ( char *  mark,
char *  grain,
int *  len 
)

Reads in a literal grain from a regex - part of the CL Regex Optimiser.

A grain is a string of safe symbols: alphanumeric, safe punctuation and escaped punctuation; numeric character codes (, ) are not supported. The last symbol might be followed by a repetition operator. It is only included in the grain if the repetition count is at least one.

This function finds the longest grain it can starting at the point in the regex indicated by mark. If *grain is not NULL, the grain data are unescaped and copied to the specified buffer. Note that the buffer will be mangled even if no grain is matched; it is guaranteed to contain a NUL-terminated string, though. If *len is not null, the length of the grain (in characters) is stored there.

Parameters
markPointer to location in the regex string from which to read.
grainOptional pointer to a buffer into which the grain data will be copied. Guaranteed to contain a NUL-terminated string even if no grain is found.
lenPointer to integer in which length of grain in characters will be stored.
Returns
Pointer to the first character after the grain it has read in (or the original "mark" pointer if no grain is found).

References cl_regopt_utf8, is_ascii_punct(), is_safe_char(), and read_kleene().

Referenced by cl_regopt_analyse(), and read_disjunction().

char* read_kleene ( char *  mark,
int *  one_or_more 
)

Reads in a repetition operator - part of the CL Regex Optimiser.

This function reads in any repetition operator allowed in PCRE syntax:

  • * (Kleene star), ?, +
  • {n}, {n,m}, {,m}, {n,}
  • optionally followed by a non-greedy (?) or possessive (+) modifier it returns a pointer to the first character after the repetition modifier it has found. If *one_or_more is not null, it is set to True if the quantifier operator mandates at least one repetition (i.e. it does not make the preceding element optional).
Parameters
markPointer to location in the regex string from which to read.
one_or_moreOptional pointer to integer, which will be set to True if repetition is not optional.
Returns
Pointer to the first character after the star or other modifier it has read in (or the original "mark" pointer if a repetition modifier was not read).

Referenced by cl_regopt_analyse(), read_grain(), and read_wildcard().

char* read_matchall ( char *  mark)

Reads in an element matching some character - part of the CL Regex Optimiser.

This function reads in an element known to match some character. The following elements are currently recognized:

  • a matchall (.)
  • a safe literal character or escaped ASCII punctuation
  • an escape sequence for a simple character class (, , etc.)
  • an escape sequence for a Unicode property
  • a hexadecimal () or octal () character code
  • a simple character set such as [a-z], [A-Z] and [0-9] The precise syntax of character sets is rather messy, so we will not make an attempt to recognize more complex sets.
Parameters
markPointer to location in the regex string from which to read.
Returns
Pointer to the first character after the character (class) it has read in (or the original "mark" pointer if nothing suitable was found).

References cl_regopt_utf8, is_safe_char(), and read_escape_seq().

Referenced by read_wildcard().

char* read_wildcard ( char *  mark)

Reads in a wildcard - part of the CL Regex Optimiser.

This function reads in a wildcard segment consisting of an element matching some character (read_matchall) or a capturing or non-capturing group, followed by an optional quantifier (read_kleene). It returns a pointer to the first character after the wildcard segment.

Groups are parsed recursively and must consist of one or more alternatives containing only valid wildcard segments.

Parameters
markPointer to location in the regex string from which to read.
Returns
Pointer to the first character after the wildcard segment (or the original "mark" pointer if a wildcard segment was not found).

References read_kleene(), and read_matchall().

Referenced by cl_regopt_analyse(), and read_disjunction().

void regopt_data_copy_to_regex_object ( CL_Regex  rx)

Copy grain set from internal global variables to CL_Regex object – part of the CL Regex Optimiser.

This function carries out four important tasks:

  • It copies the grain set found by cl_regopt_analyse from the internal global buffer to member variables of the specified CL_Regex object.
  • It casefolds the grains if required (i.e. if rx->icase is True).
  • It cuts grains to equal byte length, preserving start/end anchoring if possible. Note that our implementation of Boyer-Moore search requires equally sized grains.
  • It calls make_jump_table to generate a lookup table for the Boyer-Moore search.

This is a non-exported function.

Parameters
rxa pointer to an initialized CL_Regex object

References _CL_Regex::anchor_end, _CL_Regex::anchor_start, _CL_Regex::charset, cl_debug, CL_MAX_LINE_LENGTH, cl_regopt_anchor_end, cl_regopt_anchor_start, cl_regopt_grain, cl_regopt_grains, cl_strdup(), cl_string_canonical(), _CL_Regex::grain, _CL_Regex::grain_len, _CL_Regex::grains, _CL_Regex::icase, IGNORE_CASE, local_grain_data, and make_jump_table().

Referenced by cl_new_regex().

void update_grain_buffer ( int  at_start,
int  at_end 
)

Updates the public grain buffer – part of the CL Regex Optimiser.

This function copies the local grains to the public buffer, if they are better than the set of grains currently there. The decision is made with a heuristic based on the character length of the shortest grain and the number of different grains.

A non-exported function.

Parameters
at_startBoolean: if True, all grains are anchored on the left
at_endBoolean: if True, all grains are anchored on the right

References buf, cl_regopt_anchor_end, cl_regopt_anchor_start, cl_regopt_grain, cl_regopt_grain_len, cl_regopt_grains, grain_buffer, grain_buffer_grains, grain_buffer_len, and public_grain_data.

Referenced by cl_regopt_analyse().

Variable Documentation

char cl_regex_error[CL_MAX_LINE_LENGTH]

The error message from (PCRE) regex compilation are placed in this buffer if cl_new_regex() fails.

This global variable is part of the CL_Regex object's API.

Referenced by cl_new_regex(), and cl_regex2id().

int cl_regopt_anchor_end

Boolean: whether grains are anchored at end of string.

Referenced by cl_regopt_analyse(), regopt_data_copy_to_regex_object(), and update_grain_buffer().

int cl_regopt_anchor_start

Boolean: whether grains are anchored at beginning of string.

Referenced by cl_regopt_analyse(), regopt_data_copy_to_regex_object(), and update_grain_buffer().

char* cl_regopt_grain[MAX_GRAINS]

list of 'grains' (any matching string must contain one of these)

Referenced by cl_new_regex(), regopt_data_copy_to_regex_object(), and update_grain_buffer().

int cl_regopt_grain_len

length of shortest grain (in characters)

Referenced by cl_regopt_analyse(), and update_grain_buffer().

int cl_regopt_grains
int cl_regopt_successes = 0

A counter of how many times the "grain" system has allwoed us to avoid calling the regex engine.

See also
cl_regopt_count_get

Referenced by cl_regex_match(), cl_regopt_count_get(), and cl_regopt_count_reset().

int cl_regopt_utf8

whether the regular expression is in UTF-8 encoding (set before calling cl_regopt_analyse())

Referenced by cl_new_regex(), read_grain(), and read_matchall().

char* grain_buffer[MAX_GRAINS]

Intermediate buffer for grains.

When a regex is parsed, grains for each segment are written to this intermediate buffer; if the new set of grains is better than the current one, it is copied to the cl_regopt_ variables.grains in the local buffer may have different lengths

Referenced by cl_regopt_analyse(), read_disjunction(), and update_grain_buffer().

int grain_buffer_grains = 0

The number of grains currently in the intermediate buffer.

See also
grain_buffer

Referenced by cl_regopt_analyse(), read_disjunction(), and update_grain_buffer().

int grain_buffer_len[MAX_GRAINS]

the length of each grain (in characters)

Referenced by cl_regopt_analyse(), read_disjunction(), and update_grain_buffer().

char local_grain_data[CL_MAX_LINE_LENGTH]

A buffer for grain strings.

See also
public_grain_data

Referenced by cl_regopt_analyse(), read_disjunction(), and regopt_data_copy_to_regex_object().

char public_grain_data[CL_MAX_LINE_LENGTH]

A buffer for grain strings.

See also
local_grain_data

Referenced by update_grain_buffer().