Helios++
Helios software for LiDAR simulations
Class to read design matrices.
#include <DesignMatrixReader.h>
Public Member Functions
DesignMatrixReader (string const &path, string const &sep=",", string const &com="#", long const maxCharsPerLine=8192, size_t const bufferSize=100000)
Default constructor for DesignMatrixReader.
virtual fluxionum::DesignMatrix< VarType > | read (unordered_map< string, string > *keyval=nullptr)
Read the design matrix.
Protected Member Functions
virtual void | parseComment (string const &str, size_t const comIdx, vector< string > &header, unordered_map< string, string > *keyval)
Parse a comment line.
virtual void | parseRow (string const &str, size_t const nonEmptyIdx, vector< VarType > &values)
Parse a row.
virtual bool | isSpecComment (string const &str, size_t &colonIdx)
Check whether the comment string is a specification comment or not.
virtual void | extractSpecCommentKeyValue (string const &str, size_t const comIdx, size_t const colonIdx, string &key, string &val)
Extract the key and value from given string representing a specification comment.
virtual void | parseColumnNames (string const &val, vector< string > &header)
Parse the list of inline column names.
Protected Attributes
BufferedLineFileReader | br
The buffered line file reader to read the design matrix.
string | sep
The field separator.
string | com
The comment token. Any line whose first non-space, non-tab substring matches the comment token is a comment line.
Class to read design matrices.
A design matrix file is a CSV text file with two special types of string token: the separator token and the comment token. By default the separator token is the comma "," and the comment token is "#". Thus, a file can contain two different types of lines.
The first type of line is the comment line. Any line whose first non-space, non-tab substring is a comment token is a comment line. There are two kinds of comments: typical comments, which are ignored during parsing and serve only human readers, and specification comments, which pass extra information to the parser. For example, a specification comment can specify a header:
#HEADER: "field1","field2","field3"
The second type of line is the data record, also called data row or data point. It is a line of \(n\) fields delimited by the separator token, such as:
0.5, 1.0, 2.1
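The distinction between comment lines and data rows described above can be sketched as follows. This is a minimal illustration, not the Helios++ implementation: firstNonBlank and isCommentLine are hypothetical helper names introduced here, and treating a blank line as a non-comment is an assumption.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Helios++): index of the first character
// that is neither a space nor a tab, or std::string::npos if there is none.
std::size_t firstNonBlank(std::string const &line) {
    return line.find_first_not_of(" \t");
}

// Hypothetical helper: true when the first non-space, non-tab substring of
// the line matches the comment token `com`.
bool isCommentLine(std::string const &line, std::string const &com = "#") {
    std::size_t const idx = firstNonBlank(line);
    if (idx == std::string::npos) return false;  // Assumption: blank lines are not comments
    // Compare exactly com.size() characters starting at idx with the token
    return line.compare(idx, com.size(), com) == 0;
}
```

Any line for which such a check fails would then be handled as a data row.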
VarType | The type of value for the design matrix |
|
inline |
Default constructor for DesignMatrixReader.
path | Path to the input file containing a Design Matrix |
sep | The field separator |
com | The comment token |
maxCharsPerLine | Maximum number of characters per line |
bufferSize | The buffer size in number of lines |
|
protectedvirtual |
Extract the key and value from given string representing a specification comment.
str | String representing the specification comment | |
[in] | comIdx | The index of the first character of the comment token in the line |
[in] | colonIdx | The index of the colon separator between key and value |
[out] | key | Where the extracted key will be stored |
[out] | val | Where the extracted value will be stored |
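A minimal sketch of how a key and a value might be split out of a specification comment is given below. It is not the Helios++ implementation: extractKeyValueSketch and trimBlanks are hypothetical names, the comment token length is passed explicitly as comLen (the documented method presumably takes it from the com attribute), and trimming of surrounding spaces and tabs is an assumption.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: strip leading and trailing spaces and tabs.
std::string trimBlanks(std::string const &s) {
    std::size_t const b = s.find_first_not_of(" \t");
    if (b == std::string::npos) return "";
    std::size_t const e = s.find_last_not_of(" \t");
    return s.substr(b, e - b + 1);
}

// Sketch: the key is the text between the comment token and the colon,
// the value is everything after the colon.
// Precondition (assumed): comIdx + comLen <= colonIdx < str.size().
void extractKeyValueSketch(
    std::string const &str,
    std::size_t const comIdx,
    std::size_t const comLen,   // Length of the comment token (assumption)
    std::size_t const colonIdx,
    std::string &key,
    std::string &val
) {
    std::size_t const keyStart = comIdx + comLen;
    key = trimBlanks(str.substr(keyStart, colonIdx - keyStart));
    val = trimBlanks(str.substr(colonIdx + 1));
}
```

Under these assumptions, the line #HEADER: "field1","field2","field3" yields the key HEADER and the quoted, comma-separated list as the value.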
|
protectedvirtual |
Check whether the comment string is a specification comment or not.
The method assumes the string is a comment and does not verify it. It only distinguishes a typical comment from a specification comment.
str | The comment line being parsed. It MUST be a comment line, because the method does not check this | |
[out] | colonIdx | Where the index of the colon separator is stored. For a specification comment it differs from string::npos (the method returns true); otherwise it is exactly string::npos (the method returns false) |
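The contract between the return value and colonIdx can be illustrated with a short sketch. It assumes that a specification comment is recognised simply by containing a colon; the actual criterion used by Helios++ may be stricter. isSpecCommentSketch is a hypothetical name.

```cpp
#include <cstddef>
#include <string>

// Sketch (assumption: any comment containing ':' is a specification comment).
// Stores the colon index in colonIdx and returns whether one was found.
bool isSpecCommentSketch(std::string const &str, std::size_t &colonIdx) {
    colonIdx = str.find(':');               // std::string::npos when absent
    return colonIdx != std::string::npos;   // true <=> specification comment
}
```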
|
protectedvirtual |
Parse the list of inline column names.
val | Inline list of column names. Consecutive names must be delimited by the separator DesignMatrixReader::sep | |
[out] | header | Where the column names are stored |
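Splitting an inline list of column names could look like the sketch below. It is an illustration under assumptions, not the Helios++ code: parseColumnNamesSketch is a hypothetical name, the separator is passed as an argument instead of being read from DesignMatrixReader::sep, and removing surrounding blanks and double quotes is assumed from the #HEADER example in the class description.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Sketch: split `val` on `sep`, trim blanks, strip one pair of surrounding
// double quotes (assumption), and append each name to `header`.
void parseColumnNamesSketch(
    std::string const &val,
    std::string const &sep,
    std::vector<std::string> &header
) {
    if (sep.empty()) {  // Guard: an empty separator cannot split anything
        header.push_back(val);
        return;
    }
    std::size_t start = 0;
    while (true) {
        std::size_t const end = val.find(sep, start);
        std::string name = val.substr(
            start, end == std::string::npos ? std::string::npos : end - start
        );
        // Trim leading and trailing spaces and tabs
        std::size_t const b = name.find_first_not_of(" \t");
        std::size_t const e = name.find_last_not_of(" \t");
        name = (b == std::string::npos) ? "" : name.substr(b, e - b + 1);
        // Strip surrounding double quotes, if present
        if (name.size() >= 2 && name.front() == '"' && name.back() == '"') {
            name = name.substr(1, name.size() - 2);
        }
        header.push_back(name);
        if (end == std::string::npos) break;  // Last field consumed
        start = end + sep.size();
    }
}
```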
|
protectedvirtual |
Parse a comment line.
str | The line being parsed | |
comIdx | The index of the first character of the comment token in the line | |
[out] | header | The vector of column names defining the header. If the parsed comment is a header specification, then it will be filled |
[out] | keyval | If it is not null, then it will be used to store all key-value pairs from specification comments (except the header) |
|
protectedvirtual |
Parse a row.
str | The line being parsed | |
nonEmptyIdx | The index of the first non-empty character in the line | |
[out] | values | The vector of values defining the contents of the DesignMatrix |
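Parsing a data row can be sketched as below, assuming VarType is double and conversion is done with std::stod. Both are assumptions made for illustration; parseRowSketch is a hypothetical name and the separator is passed explicitly instead of being read from the sep attribute.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Sketch: starting at nonEmptyIdx, split the line on `sep` and append each
// field, converted to double, to `values`.
// Assumes a non-empty separator. std::stod skips leading whitespace and
// throws std::invalid_argument for an empty or non-numeric field.
void parseRowSketch(
    std::string const &str,
    std::size_t const nonEmptyIdx,
    std::string const &sep,
    std::vector<double> &values
) {
    std::size_t start = nonEmptyIdx;
    while (start <= str.size()) {
        std::size_t const end = str.find(sep, start);
        std::string const field = str.substr(
            start, end == std::string::npos ? std::string::npos : end - start
        );
        values.push_back(std::stod(field));
        if (end == std::string::npos) break;  // Last field consumed
        start = end + sep.size();
    }
}
```

With the example row 0.5, 1.0, 2.1 and the default separator, this appends three values to the vector.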
|
virtual |
Read the design matrix.
[out] | keyval | If it is not null, then it will be used to store all key-value pairs from specification comments (except the header) |