TimeStreams
TSTK tools operate on timestreams, a format for storing time-series
file data. Typically, a timestream is a series of image files (usually tif,
cr2, or jpg) from a camera, captured over the life of a field station or lab
experiment.
A timestream is (traditionally; cf. the bundling section below) a directory structure for storing time-series imaging, together with a naming convention for files. Specifically, images are stored under nested directories for the year, month, day, and hour, so that even the most prolific cameras produce no more than a few thousand images per directory. On disk, this looks like:
camera
└── 2001
    └── 2001_02
        ├── 2001_02_01
        │   ├── 2001_02_01_09
        │   │   └── camera_2001_02_01_09_14_15.tif
        │   ├── 2001_02_01_10
        │   │   └── camera_2001_02_01_10_14_15.tif
        │   ├── 2001_02_01_11
        │   │   └── camera_2001_02_01_11_14_15.tif
        │   ├── 2001_02_01_12
        │   │   └── camera_2001_02_01_12_14_15.tif
        │   └── 2001_02_01_13
        │       └── camera_2001_02_01_13_14_15.tif
        └── 2001_02_02
            ├── 2001_02_02_09
            │   └── camera_2001_02_02_09_14_15.tif
            ├── 2001_02_02_10
            │   └── camera_2001_02_02_10_14_15.tif
            ├── 2001_02_02_11
            │   └── camera_2001_02_02_11_14_15.tif
            ├── 2001_02_02_12
            │   └── camera_2001_02_02_12_14_15.tif
            └── 2001_02_02_13
                └── camera_2001_02_02_13_14_15.tif
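As an illustration of the layout above, the following is a minimal sketch of how the location of one image could be derived from its camera name and capture time. The helper name `timestream_path` is hypothetical and is not part of tstk; it only mirrors the year/month/day/hour nesting shown in the tree, and it ignores the optional index field.

```python
from datetime import datetime
from pathlib import PurePosixPath


def timestream_path(camera: str, when: datetime, ext: str = "tif") -> PurePosixPath:
    """Return the traditional on-disk location of one image (hypothetical helper)."""
    # One directory level each for year, month, day and hour.
    levels = ["%Y", "%Y_%m", "%Y_%m_%d", "%Y_%m_%d_%H"]
    directory = PurePosixPath(camera, *(when.strftime(fmt) for fmt in levels))
    # The file basename repeats the camera name and the full timestamp.
    basename = f"{camera}_{when.strftime('%Y_%m_%d_%H_%M_%S')}.{ext}"
    return directory / basename


example = timestream_path("camera", datetime(2001, 2, 1, 9, 14, 15))
print(example)
# camera/2001/2001_02/2001_02_01/2001_02_01_09/camera_2001_02_01_09_14_15.tif
```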
Timestream files also follow a specific basename pattern. Using strftime codes and shell-style variables, it is:
${CAMERA_NAME}_%Y_%m_%d_%H_%M_%S${INDEX}.${EXT}
where:
$CAMERA_NAME is the camera name (camera above), and can contain any valid alphanumeric characters and some punctuation (all of ~ - . and , are allowed, as is _ if you must, although having underscores in camera names will make life harder for you).

$INDEX is a generic field allowing sub-timepoint resolution through either time or space. If present, an underscore MUST precede this field. As far as tstk is concerned, index values are strings, with any parsing to be performed by application-specific code. We use the index for sub-second time resolution on fast cameras, for plant names or IDs, for ROI names in cropped outputs, and for image position in panoramic imaging applications. We suggest you left-pad numbers with zeros (e.g. with a printf-like code such as %04d): indexes are sorted as strings, so unpadded numbers would be iterated over lexicographically rather than numerically.

$EXT is the file's extension. On output, this will always be lower case, but some smart normalisation of both alternative spellings (jpg vs jpeg) and case (JPG vs jpg) is performed on inputs.
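To make the naming convention concrete, here is a rough sketch of parsing a basename into its fields. It is not tstk's own parser: the regular expression, the `parse_basename` helper, and the choice to map jpeg to jpg are all assumptions made for illustration. The last lines show why zero-padded indexes sort correctly as strings while unpadded ones do not.

```python
import re
from datetime import datetime

# Hypothetical pattern: camera name, six numeric date/time fields,
# an optional underscore-prefixed index, then the extension.
BASENAME_RE = re.compile(
    r"^(?P<camera>.+?)_"
    r"(?P<stamp>\d{4}_\d{2}_\d{2}_\d{2}_\d{2}_\d{2})"
    r"(?:_(?P<index>.+))?"
    r"\.(?P<ext>[^.]+)$"
)


def parse_basename(basename: str):
    """Split a basename into (camera, datetime, index or None, extension)."""
    match = BASENAME_RE.match(basename)
    if match is None:
        raise ValueError(f"not a timestream basename: {basename!r}")
    when = datetime.strptime(match["stamp"], "%Y_%m_%d_%H_%M_%S")
    # Assumed normalisation: lower-case the extension and map jpeg to jpg.
    ext = match["ext"].lower()
    ext = {"jpeg": "jpg"}.get(ext, ext)
    # The index stays a string; interpreting it is left to the application.
    return match["camera"], when, match["index"], ext


print(parse_basename("camera_2001_02_01_09_14_15.tif"))
print(parse_basename("cam_2001_02_01_09_14_15_0007.JPEG"))

# Indexes compare as strings, so padding matters for iteration order.
unpadded = sorted(str(i) for i in (1, 2, 10))
padded = sorted(f"{i:04d}" for i in (1, 2, 10))
print(unpadded)  # ['1', '10', '2']
print(padded)    # ['0001', '0002', '0010']
```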
Transparent bundling
Often, a filesystem (or sanity) limits the number of files that can be
stored. Time-series imaging can create a huge number of files even with modest
capture frequencies and numbers of cameras. Therefore, tstk can transparently
bundle timestreams into either .zip or .tar files. Specifically, one can
bundle all files below some level of the directory hierarchy (see above) by
creating per-hour, per-day, per-month, per-year, or per-timestream .zip or
.tar files. tstk can output to bundled files (see arguments like --bundle).
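The idea of bundling below a given level of the hierarchy can be sketched as follows. This is only an illustration using Python's standard zipfile module, not tstk's implementation: the level names in `LEVEL_DEPTH`, the helpers `bundle_for` and `bundle_images`, the bundle file names, and the choice to store each member under its full relative path are all assumptions.

```python
import io
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath

# Number of leading path components that name the bundle at each level
# (hypothetical level names; tstk's own option values may differ).
LEVEL_DEPTH = {"timestream": 1, "year": 2, "month": 3, "day": 4, "hour": 5}


def bundle_for(relpath: str, level: str) -> str:
    """Name of the archive that would hold `relpath` at the given level."""
    parts = PurePosixPath(relpath).parts
    return "/".join(parts[:LEVEL_DEPTH[level]]) + ".zip"


def bundle_images(images: dict, level: str) -> dict:
    """Group {relative path: bytes} into in-memory zip archives, one per bundle."""
    grouped = defaultdict(list)
    for relpath in sorted(images):
        grouped[bundle_for(relpath, level)].append(relpath)
    bundles = {}
    for bundle, members in grouped.items():
        buffer = io.BytesIO()
        with zipfile.ZipFile(buffer, "w") as archive:
            for relpath in members:
                # Assumption: members keep their full relative path.
                archive.writestr(relpath, images[relpath])
        bundles[bundle] = buffer.getvalue()
    return bundles


images = {
    "camera/2001/2001_02/2001_02_01/2001_02_01_09/camera_2001_02_01_09_14_15.tif": b"a",
    "camera/2001/2001_02/2001_02_01/2001_02_01_10/camera_2001_02_01_10_14_15.tif": b"b",
    "camera/2001/2001_02/2001_02_02/2001_02_02_09/camera_2001_02_02_09_14_15.tif": b"c",
}
bundles = bundle_images(images, "day")
print(sorted(bundles))
# ['camera/2001/2001_02/2001_02_01.zip', 'camera/2001/2001_02/2001_02_02.zip']
```

With per-day bundling, the three example images collapse into two archives; choosing a coarser level such as "year" or "timestream" would reduce the file count further at the cost of larger archives.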