CWB
cwb-decode-nqrfile.c File Reference

See cqp/corpmanag.c for the file format that this utility decodes.

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>

Macros

#define SUBCORPMAGIC   36193928
 Magic number of the subcorpus file format, as defined in the CQP code (corpmanag.c). TODO: should probably be defined centrally (cl/globals.h?).
 

Functions

int file_length (FILE *fd)
 Gets the size of the file.
 
int nqrfile_print_info (FILE *fd, int print_header)
 Reads a subcorpus file and prints information about it to STDOUT.
 
int main (int argc, char **argv)
 Main function for cwb-decode-nqrfile.
 

Detailed Description

See cqp/corpmanag.c for the file format that this utility decodes.

Note: some of the code here duplicates code in CQP's load-file functions. TODO: remove this duplication in the long term.

Macro Definition Documentation

#define SUBCORPMAGIC   36193928

Magic number of the subcorpus file format, as defined in the CQP code (corpmanag.c). TODO: should probably be defined centrally (cl/globals.h?).

Referenced by nqrfile_print_info().
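
As an illustration of how a magic number like this is typically checked, the sketch below reads the first integer of a stream and compares it with SUBCORPMAGIC. The helper name has_subcorpus_magic() is hypothetical and not part of CWB, and the sketch assumes the magic number is stored as a 32-bit integer in host byte order; see cqp/corpmanag.c for the authoritative format.

```c
#include <stdio.h>
#include <stdint.h>

#define SUBCORPMAGIC 36193928

/* Hypothetical helper (not part of CWB): returns 1 if the next integer in
 * the stream equals SUBCORPMAGIC, 0 otherwise (including on a short read).
 * Assumption: the magic number is a 32-bit integer in host byte order. */
int has_subcorpus_magic(FILE *fd)
{
  int32_t magic;

  if (fread(&magic, sizeof magic, 1, fd) != 1)
    return 0;
  return magic == SUBCORPMAGIC;
}
```

A caller would open the file in binary mode and call this helper before reading anything else, since the check consumes the first four bytes of the stream.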

Function Documentation

int file_length (FILE *fd)

Gets the size of the file.

Parameters
fd    File handle.
Returns
The size of the file, from stat().

Referenced by nqrfile_print_info().
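
A minimal sketch of obtaining a file size from an open stream through the stat() family is shown below. The name file_length_sketch() is hypothetical, and the use of fstat() on the descriptor returned by fileno() is an assumption about how such a function could be written on a POSIX system, not a copy of the CWB source.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical stand-in for file_length() (not the CWB implementation):
 * returns the size in bytes of the file behind an open stream,
 * or -1 if fstat() fails. */
long file_length_sketch(FILE *fd)
{
  struct stat st;

  if (fstat(fileno(fd), &st) != 0)
    return -1L;
  return (long) st.st_size;
}
```

Data still sitting in the stdio buffer is not yet counted in st_size, so a stream that has been written to should be flushed before its size is queried this way.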

int main (int argc, char **argv)

Main function for cwb-decode-nqrfile.

Parameters
argc    Number of command-line arguments.
argv    Command-line arguments.

References nqrfile_print_info().

int nqrfile_print_info (FILE *fd, int print_header)

Reads a subcorpus file and prints information about it to STDOUT.

"Subcorpus file" here means a file with the following layout: (a) it begins with the subcorpus magic number; (b) next comes a "registry" area terminated by one or more zero bytes; (c) the size of the subcorpus may follow; (d) the rest of the file, up to its end, consists of start-end range integer pairs.

The registry is printed if and only if print_header is true. The start-end pairs are printed on tab-delimited lines, one line per pair.
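
The output rule described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the CWB code: dump_ranges_sketch() is a hypothetical helper, each integer is assumed to be a 32-bit value in host byte order, the stream is assumed to be already positioned at the first range pair, and the registry text is passed in by the caller and printed on a line of its own.

```c
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

/* Hypothetical helper (not part of CWB): reads (start, end) pairs from `in`,
 * from the current position to end of file, and writes each pair to `out`
 * as "start<TAB>end\n". If print_header is non-zero, the caller-supplied
 * registry text is written first, on its own line.
 * Assumption: each integer is a 32-bit value in host byte order.
 * Returns the number of pairs written, or -1 if a lone integer without a
 * partner is left over at the end of the data. */
long dump_ranges_sketch(FILE *in, FILE *out, const char *registry,
                        int print_header)
{
  int32_t pair[2];
  size_t got;
  long n_pairs = 0;

  if (print_header && registry != NULL)
    fprintf(out, "%s\n", registry);

  /* fread() returns 2 while complete pairs remain */
  while ((got = fread(pair, sizeof pair[0], 2, in)) == 2) {
    fprintf(out, "%" PRId32 "\t%" PRId32 "\n", pair[0], pair[1]);
    n_pairs++;
  }
  return (got == 0) ? n_pairs : -1;
}
```

Locating the end of the registry area and handling the optional size field are deliberately left out, because the exact rules for those are defined in cqp/corpmanag.c rather than in this description.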

Parameters
fd              File pointer.
print_header    Boolean: controls whether the "registry" header in the subcorpus file is printed.
Returns
Boolean: true for all OK, false for problem.

References file_length(), registry, and SUBCORPMAGIC.

Referenced by main().