libBigWig
bigWig.h File Reference
#include "io.h"
#include "bwValues.h"
#include <inttypes.h>
#include <zlib.h>

Data Structures

struct  bwZoomHdr_t
 BigWig files have multiple "zoom" levels, each of which has its own header. This holds those headers.

struct  bigWigHdr_t
 The header section of a bigWig file.

struct  chromList_t
 Holds the chromosomes and their lengths.

struct  bwWriteBuffer_t
 This is only needed for writing bigWig files (and won't be created otherwise). This should be removed from bigWig.h.

struct  bigWigFile_t
 A structure that holds everything needed to access a bigWig file.

struct  bwOverlappingIntervals_t
 Holds interval:value associations.
 

Macros

#define LIBBIGWIG_VERSION   0.1
 
#define BIGWIG_MAGIC   0x888FFC26
 
#define CIRTREE_MAGIC   0x78ca8c91
 
#define IDX_MAGIC   0x2468ace0
 
#define DEFAULT_nCHILDREN   64
 
#define DEFAULT_BLOCKSIZE   32768
 

Enumerations

enum  bwStatsType {
  doesNotExist = -1, mean = 0, average = 0, std = 1,
  stdev = 1, dev = 1, max = 2, min = 3,
  cov = 4, coverage = 4
}
 

Functions

int bwInit (size_t bufSize)
 Initializes curl and global variables. This MUST be called before other functions (at least if you want to connect to remote files). For remote files, curl must be initialized and regions of a file are read into an internal buffer. If the buffer is too small then an excessive number of connections will be made. If the buffer is too large then more data than required is fetched. 128KiB is likely sufficient for most needs.

void bwCleanup (void)
 The counterpart to bwInit, this cleans up curl.

bigWigFile_t * bwOpen (char *fname, CURLcode(*callBack)(CURL *), const char *mode)
 Opens a local or remote bigWig file.

void bwClose (bigWigFile_t *fp)
 Closes a bigWigFile_t and frees up allocated memory.

uint32_t bwGetTid (bigWigFile_t *fp, char *chrom)
 Converts between chromosome name and ID.

void bwDestroyOverlappingIntervals (bwOverlappingIntervals_t *o)
 Frees space allocated by bwGetOverlappingIntervals.

bwOverlappingIntervals_t * bwGetOverlappingIntervals (bigWigFile_t *fp, char *chrom, uint32_t start, uint32_t end)
 Return entries overlapping an interval. Finds all entries overlapping a range and returns them, including their associated values.

bwOverlappingIntervals_t * bwGetValues (bigWigFile_t *fp, char *chrom, uint32_t start, uint32_t end, int includeNA)
 Return all per-base values in a given interval. Given an interval (e.g., chr1:0-100), return the value at each position. Positions without associated values are suppressed by default, but may be returned if includeNA is not 0.

double * bwStats (bigWigFile_t *fp, char *chrom, uint32_t start, uint32_t end, uint32_t nBins, enum bwStatsType type)
 Determines per-interval statistics. Can determine mean/min/max/coverage/standard deviation of values in one or more intervals. You can optionally give it an interval and ask for values from X number of sub-intervals.

int bwCreateHdr (bigWigFile_t *fp, int32_t maxZooms)
 Create a largely empty bigWig header. Every bigWig file has a header; this creates the template for one. It also takes care of space allocation in the output write buffer.

chromList_t * bwCreateChromList (char **chroms, uint32_t *lengths, int64_t n)
 Take a list of chromosome names and lengths and return a pointer to a chromList_t. This MUST be run before bwWriteHdr(). Note that the input is NOT free()d!

int bwWriteHdr (bigWigFile_t *bw)
 Write the header to a bigWig file. You must have already opened the output file, created a header and a chromosome list.

int bwAddIntervals (bigWigFile_t *fp, char **chrom, uint32_t *start, uint32_t *end, float *values, uint32_t n)
 Write a new block of bedGraph-like intervals to a bigWig file. Adds entries of the form: chromosome start end value to the file. These will always be added in a new block, so you may have previously used a different storage type.

int bwAppendIntervals (bigWigFile_t *fp, uint32_t *start, uint32_t *end, float *values, uint32_t n)
 Append bedGraph-like intervals to a previous block of bedGraph-like intervals in a bigWig file. If you have previously used bwAddIntervals() then this will append additional entries into the previous block (or start a new one if needed).

int bwAddIntervalSpans (bigWigFile_t *fp, char *chrom, uint32_t *start, uint32_t span, float *values, uint32_t n)
 Add a new block of variable-step entries to a bigWig file. Adds entries of the form chromosome start value to the file. Each block of such entries has an associated "span", so each value describes the region chromosome:start-(start+span).

int bwAppendIntervalSpans (bigWigFile_t *fp, uint32_t *start, float *values, uint32_t n)
 Append to a previous block of variable-step entries. If you previously used bwAddIntervalSpans(), this will continue appending more values to the block(s) it created.

int bwAddIntervalSpanSteps (bigWigFile_t *fp, char *chrom, uint32_t start, uint32_t span, uint32_t step, float *values, uint32_t n)
 Add a new block of fixed-step entries to a bigWig file. Adds entries of the form value to the file. Each block of such entries has an associated "span", "step", chromosome and start position. See the wiggle format for more details.

int bwAppendIntervalSpanSteps (bigWigFile_t *fp, float *values, uint32_t n)
 Append to a previous block of fixed-step entries. If you previously used bwAddIntervalSpanSteps(), this will continue appending more values to the block(s) it created.
 

Detailed Description

These are the functions and structures that should be used by external users. While I don't particularly recommend dealing with some of the structures (e.g., a bigWigHdr_t), they're described here in case you need them.

BTW, this library doesn't switch endianness as appropriate, since I kind of assume that there's only one type produced these days.

Macro Definition Documentation

#define BIGWIG_MAGIC   0x888FFC26

The magic number of a bigWig file.
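As a rough illustration of how such a magic number is typically used, the sketch below checks the first four bytes of a buffer against BIGWIG_MAGIC. The function name looksLikeBigWig is hypothetical and not part of libBigWig, and decoding the bytes as little-endian is an assumption about how the file was written.

```c
#include <stddef.h>
#include <stdint.h>

#define BIGWIG_MAGIC 0x888FFC26

/* Hypothetical helper, not part of libBigWig.
 * Returns 1 if the first four bytes of buf, decoded as a little-endian
 * 32-bit integer (an assumption), equal BIGWIG_MAGIC; 0 otherwise. */
int looksLikeBigWig(const unsigned char *buf, size_t len) {
    if (buf == NULL || len < 4) return 0;
    uint32_t magic = (uint32_t) buf[0]
                   | ((uint32_t) buf[1] << 8)
                   | ((uint32_t) buf[2] << 16)
                   | ((uint32_t) buf[3] << 24);
    return magic == BIGWIG_MAGIC;
}
```

A file written with the opposite byte order would store these four bytes reversed and would fail this check.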

#define CIRTREE_MAGIC   0x78ca8c91

The magic number of a "cirTree" block in a file.

#define DEFAULT_BLOCKSIZE   32768

The default decompression buffer size in bytes. This is used to determin

#define DEFAULT_nCHILDREN   64

The default number of children per block.

#define IDX_MAGIC   0x2468ace0

The magic number of an index block in a file.

#define LIBBIGWIG_VERSION   0.1

The library version number.

Enumeration Type Documentation

enum bwStatsType

An enum that dictates the type of statistic to fetch for a given interval.
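Several enumerators share a value (for example mean and average are both 0), so different spellings select the same statistic. The sketch below copies the enum from the listing above and maps a user-supplied name onto it. The helper parseStatsType is hypothetical and not part of libBigWig.

```c
#include <string.h>

/* Copied from the enumeration listing; normally provided by bigWig.h. */
enum bwStatsType {
    doesNotExist = -1, mean = 0, average = 0, std = 1,
    stdev = 1, dev = 1, max = 2, min = 3,
    cov = 4, coverage = 4
};

/* Hypothetical helper, not part of libBigWig: translate a statistic
 * name into a bwStatsType, or doesNotExist if the name is unknown. */
enum bwStatsType parseStatsType(const char *name) {
    static const struct { const char *name; enum bwStatsType type; } table[] = {
        {"mean", mean}, {"average", average}, {"std", std},
        {"stdev", stdev}, {"dev", dev}, {"max", max},
        {"min", min}, {"cov", cov}, {"coverage", coverage}
    };
    if (name == NULL) return doesNotExist;
    for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        if (strcmp(table[i].name, name) == 0) return table[i].type;
    }
    return doesNotExist;
}
```

Because the aliases compare equal, code that switches on the result only needs one case per distinct value.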

Function Documentation

int bwAddIntervals (bigWigFile_t *fp, char **chrom, uint32_t *start, uint32_t *end, float *values, uint32_t n)

Write a new block of bedGraph-like intervals to a bigWig file. Adds entries of the form: chromosome start end value to the file. These will always be added in a new block, so you may have previously used a different storage type.

In general it's more efficient to use the bwAppend* functions, but then you MUST know that the previously written block is of the same type. In other words, you can only use bwAppendIntervals() after bwAddIntervals() or a previous bwAppendIntervals().

Parameters
  fp      The output file pointer.
  chrom   A list of chromosomes, of length n.
  start   A list of start positions, of length n.
  end     A list of end positions, of length n.
  values  A list of values, of length n.
  n       The length of the aforementioned lists.
Returns
0 on success and another value on error.
See Also
bwAppendIntervals

int bwAddIntervalSpans (bigWigFile_t *fp, char *chrom, uint32_t *start, uint32_t span, float *values, uint32_t n)

Add a new block of variable-step entries to a bigWig file. Adds entries of the form chromosome start value to the file. Each block of such entries has an associated "span", so each value describes the region chromosome:start-(start+span).

This will always start a new block of values.

Parameters
  fp      The output file pointer.
  chrom   The chromosome that the entries describe.
  start   A list of start positions, of length n.
  span    The span of each entry (they must all be the same).
  values  A list of values, of length n.
  n       The length of the aforementioned lists.
Returns
0 on success and another value on error.
See Also
bwAppendIntervalSpans
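Since every variable-step entry describes chromosome:start-(start+span), a block of such entries can be rewritten as explicit bedGraph-like intervals. The sketch below does only that arithmetic. spansToIntervals is a hypothetical helper and not part of libBigWig.

```c
#include <stdint.h>

/* Hypothetical helper, not part of libBigWig: given n variable-step
 * start positions that share one span, fill end[i] = start[i] + span so
 * that (start[i], end[i]) describes the same region as an interval. */
void spansToIntervals(const uint32_t *start, uint32_t span, uint32_t n,
                      uint32_t *end) {
    for (uint32_t i = 0; i < n; i++) {
        end[i] = start[i] + span;
    }
}
```

With starts 10, 50 and 200 and a span of 20, the resulting ends are 30, 70 and 220.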

int bwAddIntervalSpanSteps (bigWigFile_t *fp, char *chrom, uint32_t start, uint32_t span, uint32_t step, float *values, uint32_t n)

Add a new block of fixed-step entries to a bigWig file. Adds entries of the form value to the file. Each block of such entries has an associated "span", "step", chromosome and start position. See the wiggle format for more details.

This will always start a new block of values.

Parameters
  fp      The output file pointer.
  chrom   The chromosome that the entries describe.
  start   The starting position of the block of entries.
  span    The span of each entry (i.e., the number of bases it describes).
  step    The step between entry start positions.
  values  A list of values, of length n.
  n       The length of values.
Returns
0 on success and another value on error.
See Also
bwAppendIntervalSpanSteps
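Going by the parameter descriptions above, the i-th value of a fixed-step block starts i steps after the block start and covers span bases. The sketch below computes that region. fixedStepRegion is a hypothetical helper and not part of libBigWig, and reading the region as half-open is an assumption carried over from the query functions.

```c
#include <stdint.h>

/* Hypothetical helper, not part of libBigWig: compute the region
 * [*regStart, *regEnd) described by the i-th (0-based) value of a
 * fixed-step block with the given start, span and step. */
void fixedStepRegion(uint32_t start, uint32_t span, uint32_t step,
                     uint32_t i, uint32_t *regStart, uint32_t *regEnd) {
    *regStart = start + i * step;
    *regEnd = *regStart + span;
}
```

For start 100, span 5 and step 10, the third value (i = 2) describes positions 120 to 125.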

int bwAppendIntervals (bigWigFile_t *fp, uint32_t *start, uint32_t *end, float *values, uint32_t n)

Append bedGraph-like intervals to a previous block of bedGraph-like intervals in a bigWig file. If you have previously used bwAddIntervals() then this will append additional entries into the previous block (or start a new one if needed).

Parameters
  fp      The output file pointer.
  start   A list of start positions, of length n.
  end     A list of end positions, of length n.
  values  A list of values, of length n.
  n       The length of the aforementioned lists.
Returns
0 on success and another value on error.
Warning
Do NOT use this after bwAddIntervalSpans(), bwAppendIntervalSpans(), bwAddIntervalSpanSteps(), or bwAppendIntervalSpanSteps().
See Also
bwAddIntervals

int bwAppendIntervalSpans (bigWigFile_t *fp, uint32_t *start, float *values, uint32_t n)

Append to a previous block of variable-step entries. If you previously used bwAddIntervalSpans(), this will continue appending more values to the block(s) it created.

Parameters
  fp      The output file pointer.
  start   A list of start positions, of length n.
  values  A list of values, of length n.
  n       The length of the aforementioned lists.
Returns
0 on success and another value on error.
Warning
Do NOT use this after bwAddIntervals(), bwAppendIntervals(), bwAddIntervalSpanSteps() or bwAppendIntervalSpanSteps().
See Also
bwAddIntervalSpans

int bwAppendIntervalSpanSteps (bigWigFile_t *fp, float *values, uint32_t n)

Append to a previous block of fixed-step entries. If you previously used bwAddIntervalSpanSteps(), this will continue appending more values to the block(s) it created.

Parameters
  fp      The output file pointer.
  values  A list of values, of length n.
  n       The length of values.
Returns
0 on success and another value on error.
Warning
Do NOT use this after bwAddIntervals(), bwAppendIntervals(), bwAddIntervalSpans() or bwAppendIntervalSpans().
See Also
bwAddIntervalSpanSteps
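The three Warning notes above amount to one rule: a bwAppend* call is only safe when the most recently written block has the matching type. One way a caller might track this is sketched below. The enum lastBlock and the function appendIsSafe are hypothetical bookkeeping on the caller's side and not part of libBigWig.

```c
/* Hypothetical caller-side bookkeeping, not part of libBigWig: the kind
 * of block most recently written by a bwAdd* or bwAppend* call. */
enum lastBlock {
    NO_BLOCK,         /* nothing written yet */
    INTERVAL_BLOCK,   /* bwAddIntervals / bwAppendIntervals */
    SPAN_BLOCK,       /* bwAddIntervalSpans / bwAppendIntervalSpans */
    SPAN_STEP_BLOCK   /* bwAddIntervalSpanSteps / bwAppendIntervalSpanSteps */
};

/* Returns 1 if appending entries of type `wanted` is allowed after a
 * block of type `last`, and 0 if a bwAdd* call is needed instead. */
int appendIsSafe(enum lastBlock last, enum lastBlock wanted) {
    return wanted != NO_BLOCK && last == wanted;
}
```

A caller would update its lastBlock value after each successful write and fall back to the corresponding bwAdd* function whenever appendIsSafe returns 0.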

void bwCleanup (void)

The counterpart to bwInit, this cleans up curl.

See Also
bwInit

void bwClose (bigWigFile_t *fp)

Closes a bigWigFile_t and frees up allocated memory.

Parameters
  fp      The file pointer.

chromList_t * bwCreateChromList (char **chroms, uint32_t *lengths, int64_t n)

Take a list of chromosome names and lengths and return a pointer to a chromList_t. This MUST be run before bwWriteHdr(). Note that the input is NOT free()d!

Parameters
  chroms   A list of chromosomes.
  lengths  The length of each chromosome.
  n        The number of chromosomes (thus, the length of chroms and lengths).
Returns
A pointer to a chromList_t or NULL on error.

int bwCreateHdr (bigWigFile_t *fp, int32_t maxZooms)

Create a largely empty bigWig header. Every bigWig file has a header; this creates the template for one. It also takes care of space allocation in the output write buffer.

Parameters
  fp        The bigWigFile_t* that you want to write to.
  maxZooms  The maximum number of zoom levels. If you specify 0 then there will be no zoom levels. A value <0 or >65535 will result in a maximum of 10.
Returns
0 on success.
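The maxZooms rule described above can be written out as a small function. This is only a restatement of the documented behaviour, not the library's code, and the name effectiveMaxZooms is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper, not part of libBigWig: the number of zoom levels
 * that results from a given maxZooms argument, following the parameter
 * description (0 means none, out-of-range values fall back to 10). */
uint16_t effectiveMaxZooms(int32_t maxZooms) {
    if (maxZooms < 0 || maxZooms > 65535) return 10;
    return (uint16_t) maxZooms;
}
```

Passing -1 is therefore a convenient way to ask for the default of 10, while 0 disables zoom levels entirely.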

void bwDestroyOverlappingIntervals (bwOverlappingIntervals_t *o)

Frees space allocated by bwGetOverlappingIntervals.

Parameters
  o       A valid bwOverlappingIntervals_t pointer.
See Also
bwGetOverlappingIntervals

bwOverlappingIntervals_t * bwGetOverlappingIntervals (bigWigFile_t *fp, char *chrom, uint32_t start, uint32_t end)

Return entries overlapping an interval. Find all entries overlapping a range and returns them, including their associated values.

Parameters
  fp      A valid bigWigFile_t pointer.
  chrom   A valid chromosome name.
  start   The start position of the interval. This is 0-based half open, so 0 is the first base.
  end     The end position of the interval. Again, this is 0-based half open, so 100 will include the 100th base...which is at position 99.
Returns
NULL on error or no overlapping values, otherwise a bwOverlappingIntervals_t * holding the values and intervals.
See Also
bwOverlappingIntervals_t
bwDestroyOverlappingIntervals
bwGetValues

uint32_t bwGetTid (bigWigFile_t *fp, char *chrom)

Converts between chromosome name and ID.

Parameters
  fp      A valid bigWigFile_t pointer.
  chrom   A chromosome name.
Returns
An ID. -1 will be returned on error (note that this is an unsigned value, so that's ~4 billion; bigWig files can't store that many chromosomes anyway).
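Because the return type is unsigned, the error value has to be compared as (uint32_t) -1 rather than with a signed test such as "< 0". The sketch below shows the convention with a stand-in lookup over a plain array. lookupTid and tidIsError are hypothetical and not part of libBigWig.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for bwGetTid(), not part of libBigWig: looks a
 * chromosome name up in a plain array of names, but uses the same error
 * convention of returning (uint32_t) -1 when the name is not found. */
uint32_t lookupTid(const char **names, uint32_t nNames, const char *chrom) {
    for (uint32_t i = 0; i < nNames; i++) {
        if (strcmp(names[i], chrom) == 0) return i;
    }
    return (uint32_t) -1; /* the same value as UINT32_MAX */
}

/* Returns 1 if tid is the error sentinel and 0 otherwise. */
int tidIsError(uint32_t tid) {
    return tid == (uint32_t) -1;
}
```

The same comparison applies to a value returned by bwGetTid() itself.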

bwOverlappingIntervals_t * bwGetValues (bigWigFile_t *fp, char *chrom, uint32_t start, uint32_t end, int includeNA)

Return all per-base values in a given interval. Given an interval (e.g., chr1:0-100), return the value at each position. Positions without associated values are suppressed by default, but may be returned if includeNA is not 0.

Parameters
  fp         A valid bigWigFile_t pointer.
  chrom      A valid chromosome name.
  start      The start position of the interval. This is 0-based half open, so 0 is the first base.
  end        The end position of the interval. Again, this is 0-based half open, so 100 will include the 100th base...which is at position 99.
  includeNA  If not 0, report NA values as well (as NA).
Returns
NULL on error or no overlapping values, otherwise a bwOverlappingIntervals_t * holding the values and positions.
See Also
bwOverlappingIntervals_t
bwDestroyOverlappingIntervals
bwGetOverlappingIntervals
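To make the relation between interval entries and per-base values concrete, the sketch below expands a few intervals into one value per base over a half-open query range, the kind of output described for a non-zero includeNA. It is not the library's implementation. expandToPerBase is a hypothetical helper, and representing NA as NaN is an assumption.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical helper, not part of libBigWig: fill out[0 .. qEnd-qStart)
 * with the value covering each base of the half-open query
 * [qStart, qEnd). Bases that no interval covers are set to NAN, which
 * is assumed here to stand for NA. */
void expandToPerBase(const uint32_t *starts, const uint32_t *ends,
                     const float *values, uint32_t n,
                     uint32_t qStart, uint32_t qEnd, float *out) {
    for (uint32_t p = qStart; p < qEnd; p++) out[p - qStart] = NAN;
    for (uint32_t i = 0; i < n; i++) {
        /* Clip each interval to the query range before copying. */
        uint32_t s = starts[i] > qStart ? starts[i] : qStart;
        uint32_t e = ends[i] < qEnd ? ends[i] : qEnd;
        for (uint32_t p = s; p < e; p++) out[p - qStart] = values[i];
    }
}
```

For intervals [0, 2) with value 1.0 and [4, 5) with value 2.0 queried over [0, 6), positions 0 and 1 hold 1.0, position 4 holds 2.0, and positions 2, 3 and 5 are NaN. Dropping the NaN entries gives the default behaviour in which positions without values are suppressed.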

int bwInit (size_t bufSize)

Initializes curl and global variables. This MUST be called before other functions (at least if you want to connect to remote files). For remote files, curl must be initialized and regions of a file are read into an internal buffer. If the buffer is too small then an excessive number of connections will be made. If the buffer is too large then more data than required is fetched. 128KiB is likely sufficient for most needs.

Parameters
  bufSize  The internal buffer size used for remote connections.
See Also
bwCleanup
Returns
0 on success and 1 on error.
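The buffer-size trade-off can be estimated with simple arithmetic. The sketch below assumes, for illustration only, that each remote request fetches at most bufSize bytes. requestsNeeded and BUF_128KIB are hypothetical names and not part of libBigWig.

```c
#include <stddef.h>

/* 128KiB, the size suggested in the description above. */
#define BUF_128KIB ((size_t) 128 * 1024)

/* Hypothetical helper, not part of libBigWig: the number of requests
 * needed to read nBytes when each request fetches at most bufSize bytes
 * (an assumption made for this estimate). Returns 0 if bufSize is 0. */
size_t requestsNeeded(size_t nBytes, size_t bufSize) {
    if (bufSize == 0) return 0;
    return nBytes / bufSize + (nBytes % bufSize != 0);
}
```

Under that assumption, reading 1MiB takes 8 requests with a 128KiB buffer but 256 requests with a 4KiB buffer, while reading a single byte still fetches up to a whole buffer.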

bigWigFile_t * bwOpen (char *fname, CURLcode(*callBack)(CURL *), const char *mode)

Opens a local or remote bigWig file.

Parameters
  fname     The file name or URL (http, https, and ftp are supported).
  callBack  An optional user-supplied function. This is applied to remote connections so users can specify things like proxy and password information. See test/testRemote for an example.
  mode      The mode, by default "r". Both local and remote files can be read, but only local files can be written. For files being written the callback function is ignored. If and only if the mode contains "w" will the file be opened for writing (in all other cases the file will be opened for reading).
Returns
A bigWigFile_t * on success and NULL on error.
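The mode rule stated above (writing if and only if the mode contains "w") can be expressed as a one-line test. The sketch below is a restatement of that rule, not the library's code. modeIsWrite is a hypothetical helper, and treating a NULL mode as the default "r" is an assumption.

```c
#include <string.h>

/* Hypothetical helper, not part of libBigWig: returns 1 if a mode
 * string selects writing (it contains 'w') and 0 otherwise. A NULL mode
 * is treated as the default "r", which is an assumption. */
int modeIsWrite(const char *mode) {
    return mode != NULL && strchr(mode, 'w') != NULL;
}
```

By this rule "w" and "rw" both select writing, while "r" and any other string without a "w" select reading.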

double * bwStats (bigWigFile_t *fp, char *chrom, uint32_t start, uint32_t end, uint32_t nBins, enum bwStatsType type)

Determines per-interval statistics. Can determine mean/min/max/coverage/standard deviation of values in one or more intervals. You can optionally give it an interval and ask for values from X number of sub-intervals.

Parameters
  fp      The file from which to extract statistics.
  chrom   A valid chromosome name.
  start   The start position of the interval. This is 0-based half open, so 0 is the first base.
  end     The end position of the interval. Again, this is 0-based half open, so 100 will include the 100th base...which is at position 99.
  nBins   The number of bins within the interval to calculate statistics for.
  type    The type of statistic.
See Also
bwStatsType
Returns
A pointer to an array of double precision floating point values. Note that bigWig files only hold 32-bit values, so this is done to help prevent overflows.
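To picture what nBins means, the sketch below splits a half-open interval into nBins roughly equal sub-intervals. This is one plausible way to do it, and the library's exact rounding may differ. binBounds is a hypothetical helper and not part of libBigWig.

```c
#include <stdint.h>

/* Hypothetical helper, not part of libBigWig: compute the half-open
 * bounds [*binStart, *binEnd) of bin i (0-based) when [start, end) is
 * split into nBins roughly equal pieces. Returns 0 on success and 1 on
 * invalid input. 64-bit arithmetic keeps the products from overflowing. */
int binBounds(uint32_t start, uint32_t end, uint32_t nBins, uint32_t i,
              uint32_t *binStart, uint32_t *binEnd) {
    if (nBins == 0 || i >= nBins || end <= start) return 1;
    uint64_t len = (uint64_t) end - start;
    *binStart = start + (uint32_t) (len * i / nBins);
    *binEnd = start + (uint32_t) (len * ((uint64_t) i + 1) / nBins);
    return 0;
}
```

Splitting [0, 100) into 4 bins gives [0, 25), [25, 50), [50, 75) and [75, 100). When the length does not divide evenly, as with [0, 10) and 3 bins, the bins are [0, 3), [3, 6) and [6, 10).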

int bwWriteHdr (bigWigFile_t *bw)

Write the header to a bigWig file. You must have already opened the output file, created a header and a chromosome list.

Parameters
  bw      The output bigWigFile_t pointer.
See Also
bwCreateHdr
bwCreateChromList