CWB
cwb-describe-corpus.c File Reference
#include "../cl/globals.h"
#include "../cl/corpus.h"
#include "../cl/attributes.h"
#include "../cl/macros.h"

Functions

void describecorpus_usage (void)
 Prints a message describing how to use the program to STDERR and then exits.
 
void describecorpus_show_attribute_names (Corpus *corpus, int type)
 Prints the names of attributes in a corpus to STDOUT.
 
void describecorpus_show_basic_info (Corpus *corpus, int with_attribute_names)
 Prints basic information about a corpus to STDOUT.
 
void describecorpus_show_statistics (Corpus *corpus)
 Prints statistical information about a corpus to STDOUT.
 
int main (int argc, char **argv)
 Main function for cwb-describe-corpus.
 

Variables

char * progname = NULL
 String set to the name of this program.
 

Function Documentation

void describecorpus_show_attribute_names ( Corpus * corpus,
int  type 
)

Prints the names of attributes in a corpus to STDOUT.

Only attributes of the specified type are listed.

Parameters
corpus  The corpus to analyse.
type    The type of attribute to show. This should be one of the attribute-type constants in cl.h (ATT_POS etc.).

References _Attribute::any, TCorpus::attributes, print_indented_list_item(), and start_indented_list().

Referenced by describecorpus_show_basic_info().
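
The filtering described above (walk the corpus's attributes and print only those of one type) can be sketched in isolation. This is a minimal illustration, not the CWB implementation: DemoAttribute, DemoCorpus, the DEMO_ATT_* constants, and demo_show_attribute_names() are hypothetical stand-ins for the real CL types and constants, and the linked-list layout is an assumption.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for the CL attribute-type constants (values are illustrative). */
enum { DEMO_ATT_POS = 1 << 0, DEMO_ATT_STRUC = 1 << 1, DEMO_ATT_ALIGN = 1 << 2 };

/* Hypothetical attribute record: assumed to be a node in a singly linked list. */
typedef struct demo_attribute {
  int type;                      /* one of the DEMO_ATT_* constants */
  const char *name;              /* attribute name, e.g. "word" */
  struct demo_attribute *next;   /* next attribute, or NULL at the end */
} DemoAttribute;

/* Hypothetical corpus record holding the head of the attribute list. */
typedef struct {
  DemoAttribute *attributes;
} DemoCorpus;

/* Print the name of every attribute whose type equals `type`, one per line.
 * Returns the number of names printed. */
int demo_show_attribute_names(FILE *out, const DemoCorpus *corpus, int type)
{
  const DemoAttribute *a;
  int count = 0;

  assert(out != NULL && corpus != NULL);

  for (a = corpus->attributes; a != NULL; a = a->next) {
    if (a->type == type) {
      fprintf(out, "  %s\n", a->name);
      count++;
    }
  }
  return count;
}
```

Passing the output stream as a parameter is a choice made for this sketch so that the function is easy to exercise; the documented function writes to STDOUT.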

void describecorpus_show_basic_info ( Corpus * corpus,
int  with_attribute_names 
)

Prints basic information about a corpus to STDOUT.

Parameters
corpus                The corpus to report on.
with_attribute_names  Boolean: if true, the count for each type of attribute is followed by a list of the attribute names of that type; otherwise only the counts are printed.

References _Attribute::any, ATT_ALIGN, ATT_POS, ATT_STRUC, TCorpus::attributes, TCorpus::charset, cl_charset_name(), cl_max_cpos(), cl_new_attribute, describecorpus_show_attribute_names(), TCorpus::info_file, TCorpus::name, TCorpus::path, TCorpus::registry_dir, TCorpus::registry_name, and word.

Referenced by main().

void describecorpus_show_statistics ( Corpus * corpus)

Prints statistical information about a corpus to STDOUT.

Information is printed for each attribute in the corpus: the numbers of tokens and types for a P-attribute, the number of regions for an S-attribute, and the number of alignment blocks for an A-attribute.

Parameters
corpus  The corpus to analyse.

References _Attribute::any, ATT_ALIGN, ATT_POS, ATT_STRUC, TCorpus::attributes, cl_has_extended_alignment(), cl_max_alg(), cl_max_cpos(), cl_max_id(), cl_max_struc(), and cl_struc_values().

Referenced by main().
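
The per-attribute reporting described above amounts to a dispatch on the attribute type. The following is a minimal sketch of that idea under stated assumptions: DemoAttrStats, the DEMO_ATT_* constants, and demo_format_statistics() are hypothetical, the counts are supplied directly rather than obtained from the CL functions listed under References, and the output wording is invented for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for the CL attribute-type constants (values are illustrative). */
enum { DEMO_ATT_POS = 1 << 0, DEMO_ATT_STRUC = 1 << 1, DEMO_ATT_ALIGN = 1 << 2 };

/* Hypothetical summary record; the real tool would query the CL for these counts. */
typedef struct {
  int type;          /* one of the DEMO_ATT_* constants */
  const char *name;  /* attribute name */
  int tokens;        /* P-attribute: number of tokens */
  int types;         /* P-attribute: number of distinct types */
  int regions;       /* S-attribute: number of regions */
  int blocks;        /* A-attribute: number of alignment blocks */
} DemoAttrStats;

/* Format one line of statistics into buf, choosing the fields by attribute type.
 * Returns snprintf's result, or -1 if the attribute type is not recognised. */
int demo_format_statistics(char *buf, size_t size, const DemoAttrStats *a)
{
  assert(buf != NULL && a != NULL);

  switch (a->type) {
  case DEMO_ATT_POS:
    return snprintf(buf, size, "p-att %s: %d tokens, %d types",
                    a->name, a->tokens, a->types);
  case DEMO_ATT_STRUC:
    return snprintf(buf, size, "s-att %s: %d regions", a->name, a->regions);
  case DEMO_ATT_ALIGN:
    return snprintf(buf, size, "a-att %s: %d alignment blocks",
                    a->name, a->blocks);
  default:
    return -1;  /* unknown attribute type: nothing is written */
  }
}
```

Formatting into a caller-supplied buffer keeps the sketch side-effect free; the documented function prints directly to STDOUT.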

void describecorpus_usage ( void  )

Prints a message describing how to use the program to STDERR and then exits.

References progname, and VERSION.

Referenced by main().
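
The print-to-STDERR-then-exit pattern described above can be sketched as follows. This is an illustration only: demo_progname, demo_print_usage(), and demo_usage() are hypothetical names, the message text is a placeholder rather than the program's actual usage text, and the exit status EXIT_FAILURE is an assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical program name; the real program sets progname at startup. */
static const char *demo_progname = "demo-describe-corpus";

/* Write a placeholder usage message to the given stream.
 * Returns fprintf's result (negative on an output error). */
int demo_print_usage(FILE *out)
{
  assert(out != NULL);
  return fprintf(out, "Usage: %s <arguments>\n", demo_progname);
}

/* Print the usage message to stderr and terminate the process.
 * This function does not return. The exit status is an assumption. */
void demo_usage(void)
{
  demo_print_usage(stderr);
  exit(EXIT_FAILURE);
}
```

Splitting the message writer from the exiting wrapper is a design choice for this sketch: it lets the message be checked without terminating the calling process.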

int main ( int  argc,
char **  argv 
)

Main function for cwb-describe-corpus.

Prints information about an indexed corpus to STDOUT.

Parameters
argc  Number of command-line arguments.
argv  Command-line arguments.

References cl_delete_corpus(), cl_new_corpus(), corpus, describe_corpus(), describecorpus_show_basic_info(), describecorpus_show_statistics(), describecorpus_usage(), progname, and registry.

Variable Documentation

char* progname = NULL

String set to the name of this program.

Referenced by describecorpus_usage(), and main().