TimeStreams

TSTK tools operate on timestreams, a format for storing time-series data as files. Typically, a timestream is a series of image files (usually tif, cr2, or jpg) from a camera, captured over the life of a field station or lab experiment.

A timestream is (traditionally; cf. bundling below) a directory structure for storing time-series images, together with a naming convention for files. Specifically, images are stored under nested directories for the year, month, day, and hour, which keeps each directory to at most a few thousand images, even for the most prolific of cameras. On disk, this looks like:

camera
└── 2001
    └── 2001_02
        ├── 2001_02_01
        │   ├── 2001_02_01_09
        │   │   └── camera_2001_02_01_09_14_15.tif
        │   ├── 2001_02_01_10
        │   │   └── camera_2001_02_01_10_14_15.tif
        │   ├── 2001_02_01_11
        │   │   └── camera_2001_02_01_11_14_15.tif
        │   ├── 2001_02_01_12
        │   │   └── camera_2001_02_01_12_14_15.tif
        │   └── 2001_02_01_13
        │       └── camera_2001_02_01_13_14_15.tif
        └── 2001_02_02
            ├── 2001_02_02_09
            │   └── camera_2001_02_02_09_14_15.tif
            ├── 2001_02_02_10
            │   └── camera_2001_02_02_10_14_15.tif
            ├── 2001_02_02_11
            │   └── camera_2001_02_02_11_14_15.tif
            ├── 2001_02_02_12
            │   └── camera_2001_02_02_12_14_15.tif
            └── 2001_02_02_13
                └── camera_2001_02_02_13_14_15.tif
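
As an illustration of the layout above, the following minimal Python sketch builds the relative path for one image from a camera name and a capture time. The function name timestream_path and its ext parameter are hypothetical and not part of tstk; the sketch only reproduces the directory and file names shown in the tree.

```python
from datetime import datetime
from pathlib import PurePosixPath


def timestream_path(camera: str, when: datetime, ext: str = "tif") -> PurePosixPath:
    """Return the relative path of one image in the traditional layout.

    Hypothetical helper for illustration only; not a tstk function.
    """
    # One directory level each for year, month, day, and hour.
    levels = ["%Y", "%Y_%m", "%Y_%m_%d", "%Y_%m_%d_%H"]
    directory = PurePosixPath(camera, *(when.strftime(level) for level in levels))
    # The file name repeats the camera name and the full timestamp.
    filename = f"{camera}_{when.strftime('%Y_%m_%d_%H_%M_%S')}.{ext}"
    return directory / filename


print(timestream_path("camera", datetime(2001, 2, 1, 9, 14, 15)))
```

For the first image in the tree, this prints camera/2001/2001_02/2001_02_01/2001_02_01_09/camera_2001_02_01_09_14_15.tif.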

Timestream files also follow a specific basename pattern. Using strftime codes and shell-style variables, it is:

${CAMERA_NAME}_%Y_%m_%d_%H_%M_%S${INDEX}.${EXT}

where:

  • $CAMERA_NAME is the camera name (camera above). It can contain any alphanumeric characters and some punctuation: any of ~ - . , and, if you must, _ (although underscores in camera names will make life harder for you).

  • $INDEX is a generic field that allows sub-timepoint resolution in either time or space. If present, this field MUST be preceded by an underscore. As far as tstk is concerned, index values are strings, and any parsing is left to application-specific code. We use the index for sub-second time resolution on fast cameras, for plant names or IDs, for ROI names in cropped outputs, and for image position in panoramic imaging applications. Indexes are sorted as strings, so unpadded numbers are iterated over lexicographically (10 sorts before 9). We therefore suggest you left-pad numbers with zeros (e.g. with a printf-like code such as %04d) so that images sort correctly.

  • $EXT is the file’s extension. On output, this will always be lower case. On input, some smart normalisation of both alternative spellings (jpg vs jpeg) and case (JPG vs jpg) is performed.
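
The pattern and fields above can be sketched as a small Python parser. This is an illustration under stated assumptions, not tstk's own parsing code: the names BASENAME_RE, ParsedName, and parse_basename are hypothetical, and mapping jpeg to jpg is an assumed direction for the spelling normalisation. The last lines show why zero-padded indexes sort correctly as strings.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional

# Camera name, timestamp, optional "_INDEX", then the extension.
# The camera group is greedy, so underscores in camera names still parse.
BASENAME_RE = re.compile(
    r"^(?P<camera>.+)_"
    r"(?P<stamp>\d{4}_\d{2}_\d{2}_\d{2}_\d{2}_\d{2})"
    r"(?:_(?P<index>.+))?"
    r"\.(?P<ext>[^.]+)$"
)


class ParsedName(NamedTuple):
    camera: str
    when: datetime
    index: Optional[str]  # kept as a string; callers interpret it
    ext: str


def parse_basename(name: str) -> ParsedName:
    """Split a timestream basename into its fields (hypothetical helper)."""
    match = BASENAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a timestream basename: {name!r}")
    # Assumption: lower-case the extension and spell "jpeg" as "jpg".
    ext = match["ext"].lower()
    ext = {"jpeg": "jpg"}.get(ext, ext)
    when = datetime.strptime(match["stamp"], "%Y_%m_%d_%H_%M_%S")
    return ParsedName(match["camera"], when, match["index"], ext)


print(parse_basename("camera_2001_02_01_09_14_15_0003.JPEG"))

# Indexes sort as strings, so padding matters.
print(sorted(str(i) for i in (2, 9, 10)))       # ['10', '2', '9']
print(sorted(f"{i:04d}" for i in (2, 9, 10)))   # ['0002', '0009', '0010']
```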

Transparent bundling

Often, a filesystem (or sanity) limits the number of files. Time-series imaging can create a huge number of files even with modest capture frequencies and numbers of cameras. Therefore, tstk can transparently bundle timestreams into either .zip or .tar files. Specifically, one can bundle files below some level of the directory hierarchy (see above) by creating per-hour, per-day, per-month, per-year, or per-timestream .zip or .tar files. tstk can output to bundled files (see arguments like --bundle).
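
To make the idea concrete, here is a rough Python sketch of bundling everything below one level of the hierarchy (a single day, say) into a .zip file using the standard zipfile module. It is not how tstk implements bundling: the function bundle_directory is hypothetical, and both the bundle's file name (the directory name plus .zip) and the paths stored inside it are assumptions made for illustration.

```python
import zipfile
from pathlib import Path


def bundle_directory(directory: Path) -> Path:
    """Zip every file below `directory` into a sibling '<name>.zip'.

    Hypothetical helper; the bundle name and internal paths are assumptions.
    """
    bundle = directory.with_name(directory.name + ".zip")
    with zipfile.ZipFile(bundle, "w") as archive:
        for path in sorted(directory.rglob("*")):
            if path.is_file():
                # Store paths relative to the parent so the hierarchy
                # below the bundled level is preserved inside the archive.
                archive.write(path, path.relative_to(directory.parent).as_posix())
    return bundle
```

Calling bundle_directory(Path("camera/2001/2001_02/2001_02_01")) would, under these assumptions, write camera/2001/2001_02/2001_02_01.zip containing that day's hourly directories and images. The original directory is left in place.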