data.segmentation.avg

data.segmentation.avg serves as the counterpart to data.avg that works on intervals defined by segments instead of fixed times. Like data.avg, it rectangularizes its input. In addition to the time, data.segmentation.avg may also generate names for the segments; see below.

Command Line Usage

data.segmentation.avg [--stddev=on/off | --nostddev]
                      [--count=on/off] [--source=archive]
                      [--contam] [--contam-match=X1,X2...]
                      [--require-uncontaminated=N]
                      [--decimal-format=FORMAT,MVC] [--remove-end=seconds]
                      [--name=(Segment1)..(Segment2)]
                      segment station records start [[end] archive]

Arguments

start and end

The time specifiers for the data to be retrieved. Start is inclusive while end is exclusive, so all segments contained within the half-open interval [start,end) will be returned. Any convertible time format is accepted. Note that only data within the start and end range is averaged, even if segments in that range extend beyond it.
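
The half-open window and the clipping of segments that extend past it can be sketched in Python. This is only an illustration, not the tool's implementation: the helper name clip_to_window is hypothetical, and it assumes that a segment overlapping the window only partially is trimmed to the overlap before averaging.

```python
from datetime import datetime

def clip_to_window(seg_start, seg_end, start, end):
    """Return the part of segment [seg_start, seg_end) inside [start, end).

    Hypothetical helper: returns None when the segment and window do not overlap.
    """
    lo = max(seg_start, start)
    hi = min(seg_end, end)
    if lo >= hi:
        # An empty overlap means nothing from this segment would be averaged.
        return None
    return lo, hi

# A one-day window and a segment that runs past the end of it.
window = (datetime(2008, 1, 10), datetime(2008, 1, 11))
segment = (datetime(2008, 1, 10, 22), datetime(2008, 1, 11, 3))
clipped = clip_to_window(*segment, *window)
print(clipped)
```

Because end is exclusive, a segment that begins exactly at end contributes nothing to the result.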

station

The station identifier code. For example 'brw'. Case insensitive.

records

The cpd2 record type to be retrieved. For example: 'S11a'. Case sensitive. Multiple record types may be separated by “,”, “;” or “:”. May also be “-” to read from standard input.
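
As a rough illustration of the accepted separators, the records argument could be split as in the Python sketch below. The function name split_records is hypothetical, and dropping empty entries and returning None for "-" are assumptions made for the example, not documented behavior.

```python
import re

def split_records(spec):
    """Split a records argument on ',', ';' or ':'.

    Hypothetical helper: returns None for '-' to signal standard input.
    """
    if spec == "-":
        return None
    # Empty entries (for example from a trailing separator) are dropped here;
    # whether the real tool does the same is an assumption.
    return [record for record in re.split(r"[,;:]", spec) if record]

print(split_records("S11a,A11a"))
```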

segment

The segment type on which to split the data.

--stddev=on/off --nostddev

Enable or disable (default enabled) standard deviation calculation for each field. For each field, a separate field is generated that contains the standard deviation for that record.

--count=on/off

Enable or disable (default disabled) generating a count field for each variable (after cut splitting). The count field is the variable name with an “N” added to the end.
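
The naming of the count field can be illustrated with a short Python sketch. It is not the tool's code: the function average_with_counts is hypothetical, the variable name is only an example, and the sketch assumes that missing values (represented here by None) are excluded from both the average and the count.

```python
def average_with_counts(fields):
    """Average each variable and add a '<variable>N' count field.

    Hypothetical helper: None stands in for a missing value and is skipped.
    """
    out = {}
    for name, values in fields.items():
        valid = [v for v in values if v is not None]
        out[name] = sum(valid) / len(valid) if valid else None
        # The count field is the variable name with "N" appended.
        out[name + "N"] = len(valid)
    return out

print(average_with_counts({"BsG_S11": [10.0, None, 14.0]}))
```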

archive or --source=archive

Select the source archive to request data from, defaulting to clean. Has no effect when working on standard input.

--contam

Enable averaging of contaminated data. The variables that are ignored during contamination are either set with the --contam-match switch or loaded from contaminate.conf.

--contam-match=pattern1,pattern2,...

List of Perl regular expressions that define all variables that are affected by contamination. Any variable that matches one of these patterns will not be averaged when it is contaminated (assuming --contam has not been set). If this option is set, contaminate.conf is not used; instead, these patterns are applied for all time.
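
The pattern test can be sketched in Python as below. This is an approximation only: Python's re module stands in for Perl regular expressions, the function name and the variable names are hypothetical, and matching anywhere in the name (rather than requiring a full match) is an assumption.

```python
import re

def is_contam_affected(variable, patterns):
    """Return True if the variable name matches any contamination pattern.

    Hypothetical helper: uses re.search, so unanchored patterns match anywhere.
    """
    return any(re.search(pattern, variable) for pattern in patterns)

# Patterns as they might be given to --contam-match, split on ",".
patterns = "^Bs,^Ba".split(",")
print(is_contam_affected("BsG_S11", patterns))
print(is_contam_affected("T_S11", patterns))
```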

--require-uncontaminated=N

Set the fraction of data required to be uncontaminated before any average is emitted. That is, if the uncontaminated fraction of the non-MVC data in the input is less than N, then the output is an MVC.

--decimal-format=FORMAT,MVC

Set the format and/or MVC of all decimal numbers that are averaged. For example “--decimal-format=%0+10.3e,+9.999e-99” would cause all decimal floating point numbers to be output in scientific notation instead of their existing decimal format.
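
To show what the example format produces, the Python sketch below applies the same printf-style specifier with Python's % operator. The function name format_decimal is hypothetical, and the sketch assumes the tool interprets FORMAT with printf conventions and writes the MVC string verbatim for missing values.

```python
def format_decimal(value, fmt="%0+10.3e", mvc="+9.999e-99"):
    """Format a value with a printf-style specifier, or emit the MVC string.

    Hypothetical helper: None stands in for a missing value.
    """
    if value is None:
        return mvc
    return fmt % value

print(format_decimal(12.34))
print(format_decimal(None))
```

With this format, every value occupies ten characters, including an explicit sign.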

--remove-end=seconds

Set the number of seconds to subtract from the end of all segments.

--name=(Segment1)..(Segment2)

Set the value for the NAME variable. Anything between parentheses is interpreted as follows:

Contents      Result
'(' or ')'    The literal value is inserted.
YEAR          The starting year.
DOY           The starting DOY.
sISO          The starting time in ISO 8601 format.
eISO          The ending time in ISO 8601 format.
Name          The data value for the segment type 'Name' at this time.
DAY:Name      The sequence number for segment type 'Name' as a character (A-Z).

So, to construct a name field like a flight segment ID (YYYY_DOY_X_SITE_SEG), the switch would be:

--name=(YEAR)_(DOY)_(DAY:flight)_(site)_(leg)

To build a label with the start and end times of the segment, in ISO format, along with the data value for segment type 'F21_chemfilter' the switch would be:

--name=(sISO)_(eISO)_(F21_chemfilter)

If omitted, the records will not contain a NAME variable.
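
The substitution rules for --name can be sketched in Python as follows. This is an illustration under stated assumptions, not the tool's implementation: the function expand_name and its arguments are hypothetical, the zero-padded DOY, the exact ISO 8601 layout, and the mapping of sequence number 0 to 'A' are guesses, and the segment values are invented for the example.

```python
import re
from datetime import datetime

# One parenthesized token: a literal '(' or ')', or any text without parentheses.
TOKEN = re.compile(r"\((\(|\)|[^()]+)\)")

def expand_name(template, start, end, values, sequence):
    """Expand a --name style template for one segment (hypothetical helper).

    values maps segment type -> data value at this time;
    sequence maps segment type -> zero-based sequence number.
    """
    def substitute(match):
        key = match.group(1)
        if key in ("(", ")"):
            return key                                   # literal parenthesis
        if key == "YEAR":
            return f"{start.year:04d}"
        if key == "DOY":
            return f"{start.timetuple().tm_yday:03d}"    # padding is assumed
        if key == "sISO":
            return start.strftime("%Y-%m-%dT%H:%M:%SZ")  # layout is assumed
        if key == "eISO":
            return end.strftime("%Y-%m-%dT%H:%M:%SZ")
        if key.startswith("DAY:"):
            return chr(ord("A") + sequence[key[4:]])     # 0 -> 'A' is assumed
        return str(values[key])                          # plain segment type
    return TOKEN.sub(substitute, template)

name = expand_name(
    "(YEAR)_(DOY)_(DAY:flight)_(site)_(leg)",
    start=datetime(2008, 1, 10, 14, 30),
    end=datetime(2008, 1, 10, 15, 0),
    values={"site": "AAO", "leg": "3"},
    sequence={"flight": 0},
)
print(name)
```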

Example Usage

Average all flights on a given day

data.segmentation.avg --name="(YEAR)_(DOY)_(DAY:flight)_(site)_(leg)" flight aao S11a 2008:10 2008:11

Average a single flight

data.segmentation.avg --name="(YEAR)_(DOY)_(DAY:flight)_(site)_(leg)" flight aao S11a `data.flight aao 201`

Filling data from a pipe to a file

data.get sgp A11a 2008 2009 raw | data.avg --fill > 2008_hourly

Calculate average scattering and absorption for filter samples

data.avg --interval=0 --nostddev --contam cpt S11a,A11a 2008 2011 | \
data.segmentation.avg --nostddev --contam --name="(F81_PM1_filter)_(sISO)_(eISO)" F81_PM1_filter cpt - 2008 2011 | \
data.edit.wl | \
data.export --mode=xl