API Reference¶
FileLock¶
- exception pyts2.filelock.FileLockException¶
Exception for file locking. Raised when unable to acquire file lock, via
acquire().
- class pyts2.filelock.FileLock(file_name, timeout=10, delay=0.05)¶
A file locking mechanism with context-manager support, so you can use it in a with statement. It should be relatively cross-platform, as it doesn’t rely on msvcrt or fcntl for the locking. Can be used either through acquire() or with FileLock():.
- Parameters
file_name (str) – Path of file to lock.
timeout (numeric, optional) – Seconds to wait for locking process to complete. Defaults to the TSTK_TIMEOUT environment variable, if set.
delay (numeric, optional) – Seconds to wait between checking for file availability. Defaults to 0.05 seconds.
- acquire()¶
Acquire the lock, if possible. If the lock is in use, it checks again every delay seconds. It does this until it either gets the lock or exceeds timeout number of seconds, in which case it throws an exception.
- Raises
FileLockException – Unable to acquire lock.
- release()¶
Get rid of the lock by deleting the lockfile. When working in a with statement, this gets automatically called at the end.
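The acquire/retry/timeout behaviour described above can be sketched with the standard library alone. This is a minimal illustration and not pyts2's implementation: the names SketchFileLock and SketchLockError, the ".lock" suffix, and the use of os.O_EXCL are all assumptions made for the example.

```python
import os
import time


class SketchLockError(Exception):
    """Raised when the sketch lock cannot be acquired in time."""


class SketchFileLock:
    """Hypothetical stand-in for a lock-file based lock (not pyts2 code)."""

    def __init__(self, file_name, timeout=10, delay=0.05):
        self.lockfile = f"{file_name}.lock"  # assumed naming scheme
        self.timeout = timeout
        self.delay = delay
        self.fd = None

    def acquire(self):
        start = time.monotonic()
        while True:
            try:
                # O_EXCL makes creation fail if the lock file already exists.
                self.fd = os.open(self.lockfile,
                                  os.O_CREAT | os.O_EXCL | os.O_RDWR)
                return
            except FileExistsError:
                if time.monotonic() - start >= self.timeout:
                    raise SketchLockError(
                        f"timed out waiting for {self.lockfile}")
                time.sleep(self.delay)  # wait, then check again

    def release(self):
        # Delete the lock file so that other processes can acquire it.
        if self.fd is not None:
            os.close(self.fd)
            os.unlink(self.lockfile)
            self.fd = None

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.release()
        return False
```

Used as `with SketchFileLock("data.txt"):`, the lock file exists for the duration of the block and is removed when the block exits, including when it exits through an exception.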
Removalist¶
Time¶
- pyts2.time.parse_date(datestr)¶
Parses dates in iso8601-ish formats to datetime.datetime objects.
- Parameters
datestr (str) – A string containing a datetime
- Returns
Datetime object representing the given string
- Return type
datetime.datetime object
- class pyts2.time.TSInstant(datetime, index=None)¶
TSInstant: a generalised “moment in time”, including both timepoint and optional index within a timepoint.
>>> TSInstant(datetime.datetime(2017, 1, 2, 3, 4, 5))
2017_01_02_03_04_05
>>> TSInstant(datetime.datetime(2017, 1, 2, 3, 4, 5), "0011")
2017_01_02_03_04_05_0011
Initiates at a given datetime and optional index within that timepoint.
- Parameters
datetime (str) – A string in an ISO-8601-like format (see parse_date()).
index (int or string containing a usable int, optional) – Index number
- property index¶
Index of timepoint.
- Setter
Converts to int, stripping underscores if needed. Sets to None if 00 or empty string.
- iso8601()¶
Converts own datetime to an ISO-8601 string.
- Returns
Datetime string in ISO-8601 format.
- Return type
str
- static from_path(path)¶
Extract date and index from path to timestream image
- Parameters
path (str) – File path, with or without directory
- Returns
Datetime indicated by path
- Return type
>>> TSInstant.from_path("2001_02_03_23_59_59_00.jpg")
2001_02_03_23_59_59
>>> TSInstant.from_path("2001_02_03_23_59_59_indexhere.jpg")
2001_02_03_23_59_59_indexhere
- pyts2.time.parse_partial_date(datestr, max=False)¶
Parses date strings with implicit date format.
- Parameters
datestr (str) – A string that contains a date.
max (bool) – Default to maximum value within possible date range (e.g. if date only has up to hour precision, set minutes field to 59)
- Returns
Date, time
- Return type
datetime.datetime, datetime.datetime
- class pyts2.time.TimeFilter(startdate=None, enddate=None, starttime=None, endtime=None)¶
Check datetimes fall within a given time period.
Initiates with a datetime range to check against.
- Parameters
startdate (datetime.date object, TSInstant object, or date string) – Start of date range.
enddate (datetime.date object, TSInstant object, or date string) – End of date range. Must be later than startdate.
starttime (datetime.time object, TSInstant object, or time string) – Start of time range per day within date range.
endtime (datetime.time object, TSInstant object, or time string) – End of time range per day within date range. Must be later than starttime.
- partial_within(datestr)¶
Checks if a given datetime is within the datetime range of the current object.
- Parameters
datestr (str) – String of datetime to check.
- Returns
True if within datetime range, False if not
- Return type
bool
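The combination of a date range with a per-day time window can be illustrated with plain datetime objects. This is only a sketch of the idea: within_window is a hypothetical helper, not part of pyts2, and treating every bound as inclusive is an assumption.

```python
import datetime as dt


def within_window(moment, startdate=None, enddate=None,
                  starttime=None, endtime=None):
    """Return True if `moment` passes every bound that is not None."""
    if startdate is not None and moment.date() < startdate:
        return False
    if enddate is not None and moment.date() > enddate:
        return False
    # The time-of-day window applies to each day in the date range.
    if starttime is not None and moment.time() < starttime:
        return False
    if endtime is not None and moment.time() > endtime:
        return False
    return True


noon = dt.datetime(2019, 6, 15, 12, 0, 0)
# Inside the June date range, no time-of-day restriction.
print(within_window(noon, startdate=dt.date(2019, 6, 1),
                    enddate=dt.date(2019, 6, 30)))   # True
# Noon falls outside a 13:00-17:00 daily window.
print(within_window(noon, starttime=dt.time(13, 0),
                    endtime=dt.time(17, 0)))         # False
```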
Timestream¶
- pyts2.timestream.path_is_timestream_file(path, extensions=None)¶
Tests whether the given path contains a valid timestream format, and optionally checks that it has an expected file extension.
- Parameters
path (str) – File path, with or without directory
extensions (str) – Optionally, one or more extensions to accept
- Returns
True if path is timestream and extension compatible, otherwise False
- Return type
bool
>>> path_is_timestream_file("test_2018_12_31_23_59_59_00.jpg")
True
>>> path_is_timestream_file("test_2018_12_31_23_59_59_00_1.jpg")
True
>>> path_is_timestream_file("2018_12_31_23_59_59_00.jpg")
True
>>> path_is_timestream_file("test_2018_12_31_23_59_59_00.jpg", extensions="jpg")
True
>>> path_is_timestream_file("test_2018_12_31_23_59_59_00.jpg", extensions="tif")
False
>>> path_is_timestream_file("not-a-timestream.jpg")
False
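A check of this kind could be approximated with a regular expression over the file name. The sketch below is an assumption about how such a test might work, not the library's code: looks_like_timestream_file and TIMESTAMP_RE are hypothetical names, and the pattern only requires a YYYY_MM_DD_HH_MM_SS run somewhere in the name.

```python
import os
import re

# Assumed pattern: YYYY_MM_DD_HH_MM_SS somewhere in the file name.
TIMESTAMP_RE = re.compile(r"\d{4}_\d{2}_\d{2}_\d{2}_\d{2}_\d{2}")


def looks_like_timestream_file(path, extensions=None):
    """Return True if the name holds a timestamp and an accepted extension."""
    name = os.path.basename(path)
    if TIMESTAMP_RE.search(name) is None:
        return False
    if extensions is None:
        return True
    if isinstance(extensions, str):
        extensions = [extensions]  # allow a single extension as a string
    ext = os.path.splitext(name)[1].lstrip(".").lower()
    return ext in {e.lstrip(".").lower() for e in extensions}
```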
- class pyts2.timestream.Fetcher¶
Gets files from archive bundles.
- classmethod from_json(obj)¶
Takes an object path and returns format-decoded file content.
- Parameters
obj (JSON object/dict) – File object to look up
- Returns
Fetched file
- Return type
File content
- class pyts2.timestream.ZipContentFetcher(archivepath, pathinzip)¶
Retrieves files from zip archives.
- Parameters
archivepath (str) – Path to zip file
pathinzip (str) – Path to file to retrieve, within zip
- get()¶
- Returns
File retrieved from zip archive.
- Return type
File content
- dict()¶
- Returns
Summary of fetch parameters
- Return type
dict
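Reading one member out of a zip archive, which is what a fetcher like this is described as doing, takes only a few lines with the standard zipfile module. read_from_zip is a hypothetical helper written for illustration; the reference above does not say whether get() returns raw bytes or a decoded object, so returning bytes is an assumption.

```python
import zipfile


def read_from_zip(archivepath, pathinzip):
    """Return the raw bytes of one member of a zip archive."""
    with zipfile.ZipFile(archivepath) as archive:
        return archive.read(pathinzip)
```

Opening the archive only for the duration of the read keeps file handles short-lived, which matters when iterating over many bundled files.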
- class pyts2.timestream.TarContentFetcher(archivepath, pathintar)¶
Retrieves files from tar archives.
- Parameters
archivepath (str) – Path to tar file
pathintar (str) – Path to file to retrieve, within tar file
- get()¶
- Returns
File retrieved from tar archive.
- Return type
File content
- dict()¶
- Returns
Summary of fetch parameters
- Return type
dict
- class pyts2.timestream.FileContentFetcher(path)¶
Retrieves files from disk, not in an archive.
- Parameters
path (str) – Path to file to retrieve
- get()¶
- Returns
File retrieved from disk.
- Return type
File content
- dict()¶
- Returns
Summary of fetch parameters
- Return type
dict
- class pyts2.timestream.TimestreamFile(instant=None, filename=None, fetcher=None, content=None, report=None, format=None)¶
A container class for files in timestreams
- Parameters
instant (TSInstant object or derived from file path) – Datetime point that this file represents
filename (str) – Name of file, can be retrieved from fetcher
fetcher (Fetcher, optional) – File fetcher object to retrieve file content from disk or within archive formats
content (File content, optional) – File content (usually if creating timestream file in pipelines)
report (dict) – Variable to store reports from Timestream pipeline components
format (str, optional) – File format, often the file extension (e.g. jpg)
- clear_content()¶
Deletes all file content in object.
- classmethod from_path(path, instant=None)¶
Create timestream file from file path.
- Parameters
path (str) – Path to file
instant (time.TSInstant, optional) – Timepoint associated with object. Attempts to use datetime in path if not provided.
- Returns
Timestream file
- Return type
- classmethod from_bytes(filebytes, filename, instant=None)¶
Create timestream file programmatically.
- Parameters
filebytes (bytes) – Bytes to encode
filename (str) – Name of file
instant (time.TSInstant, optional) – Timepoint associated with object. Attempts to use datetime in path if not provided.
- Returns
Timestream file
- Return type
- isodate()¶
Convenience helper to get the ISO-8601 string.
- checksum(algorithm='md5')¶
Checksum content of this object by the given algorithm.
- Parameters
algorithm (str, optional) – Algorithm name supported by hashlib
- Returns
Checksum of file contents
- Return type
string object of double length, containing only hexadecimal digits
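For reference, hashing a byte string by algorithm name looks like this with hashlib. hex_checksum is a hypothetical helper and not the method's actual body. The hex string is twice as long as the raw digest because each byte is written as two hexadecimal digits.

```python
import hashlib


def hex_checksum(content, algorithm="md5"):
    """Hash bytes with any algorithm name that hashlib.new() accepts."""
    return hashlib.new(algorithm, content).hexdigest()


digest = hex_checksum(b"example content", "sha256")
# SHA-256 produces 32 raw bytes, hence 64 hexadecimal characters.
print(len(digest))  # 64
```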
- class pyts2.timestream.TimeStream(path=None, format=None, bundle_level='none', name=None, timefilter=None, add_subsecond_field=False, flat_output=False, write_index=False)¶
Represents a set of files organised in Timestream format.
- Parameters
path (str) – Base directory of a timestream
format (str) – Format of files in timestream, usually the same as the file extension (e.g. .jpg)
bundle_level (str) – Smallest time unit to bundle files at.
name (str) – Name of timestream
timefilter (time.TimeFilter, optional) – TimeFilter object for defining datetime range to iterate over
add_subsecond_field (bool, optional) – Enable for timestreams with sub-second records, using an additional _[00-99] at the end of filenames
flat_output (bool, optional) – Store timestream in a flat file structure, instead of Timestream directory structure
write_index (bool, optional) – Save file paths in timestream to an index.json file
- open(path, format=None)¶
Opens a stored timestream.
- Parameters
path (str) – Path to timestream file/directory
format (str, optional) – Timestream format, if not unarchived or in a tar/zip archive
- index(progress=True)¶
Update timestream index if empty. Useful before searching a timestream for specific timepoints. Also updates index file if one is set/exists.
- Parameters
progress (bool, optional) – Show progress of indexing with tqdm
- getinstant(value)¶
Retrieves files from timestream at a specific time point.
- Parameters
value (time.TSInstant) – Time point to retrieve
- Returns
File if present
- Return type
- from_inotify(basedir)¶
Watch a directory for files using inotify and yield new files as they are added to it. Rescans every 10 minutes.
- Parameters
basedir (str) – Directory to watch.
- from_fofn(pathorfile)¶
Yields files from a list of file names, most commonly from an index file.
- Parameters
pathorfile (str or io.IOBase) – Path to an index file, or open file of file names
- Returns
Iteratively returns files from file names
- Return type
- iter(tar_contents=True)¶
Yields files in timestream, sorted by time.
- Parameters
tar_contents (bool, optional) – Load tar contents into TimestreamFile immediately, rather than through TarContentFetcher
- Returns
Timestream files, iteratively
- Return type
- write(file)¶
Adds a file to this timestream.
- Parameters
file (TimestreamFile) – A valid file for this timestream
- close()¶
Maintains the same interface as other objects with close() methods, but does not currently need further actions.
- class pyts2.timestream.InMemoryTimeStream(name=None, format=None, timefilter=None, add_subsecond_field=False, flat_output=False)¶
InMemoryTimeStream does not support bundling yet.
- Parameters
name (str) – Name of timestream
format (str) – Format of files in timestream, usually the same as the file extension (e.g. .jpg)
timefilter (time.TimeFilter, optional) – TimeFilter object for defining datetime range to iterate over
add_subsecond_field (bool, optional) – Enable for timestreams with sub-second records, using an additional _[00-99] at the end of filenames
flat_output (bool, optional) – Store timestream in a flat file structure, instead of Timestream directory structure
- write(file)¶
Adds a file to this timestream.
- Parameters
file (TimestreamFile) – A valid file for this timestream
- iter()¶
Yields files in timestream, sorted by time.
- Parameters
tar_contents (bool, optional) – Load tar contents into TimestreamFile immediately, rather than through TarContentFetcher
- Returns
Timestream files, iteratively
- Return type
- from_inotify()¶
Watch a directory for files using inotify and yield new files as they are added to it. Rescans every 10 minutes.
- Parameters
basedir (str) – Directory to watch.
- from_fofn()¶
Yields files from a list of file names, most commonly from an index file.
- Parameters
pathorfile (str or io.IOBase) – Path to an index file, or open file of file names
- Returns
Iteratively returns files from file names
- Return type
- getinstant(value)¶
Retrieves files from timestream at a specific time point.
- Parameters
value (time.TSInstant) – Time point to retrieve
- Returns
File if present
- Return type
- index()¶
Update timestream index if empty. Useful before searching a timestream for specific timepoints. Also updates index file if one is set/exists.
- Parameters
progress (bool, optional) – Show progress of indexing with tqdm
- open(path, format=None)¶
Opens a stored timestream.
- Parameters
path (str) – Path to timestream file/directory
format (str, optional) – Timestream format, if not unarchived or in a tar/zip archive
Utils¶
- pyts2.utils.nowarnings(func)¶
Decorator to always ignore warnings from a given function.
- Parameters
func (function) – Function to decorate
- Returns
Wrapped function
- Return type
function
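A decorator with this behaviour can be written with the standard warnings module. The sketch below uses the hypothetical name quiet and is not the library's source. It suppresses warnings only while the wrapped function runs and restores the previous filters afterwards.

```python
import functools
import warnings


def quiet(func):
    """Run `func` with all warnings suppressed (illustrative equivalent)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # catch_warnings() restores the filter state on exit.
        with warnings.catch_warnings():
            warnings.simplefilter("ignore")
            return func(*args, **kwargs)
    return wrapper


@quiet
def noisy():
    warnings.warn("this is hidden", UserWarning)
    return 42


print(noisy())  # 42, with no warning emitted
```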
- pyts2.utils.find_files(base)¶
Generator to iterate over a directory.
- Parameters
base – Path to iterate over.
- Returns
Yields paths to files in directory, or path given if path provided was to a single file.
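The described behaviour (walk a directory, or yield the path itself when it is a single file) might be sketched as follows. walk_files is a hypothetical name, and basing it on os.walk is an assumption about the approach rather than the library's code.

```python
import os


def walk_files(base):
    """Yield file paths under `base`, or `base` itself if it is a file."""
    if os.path.isfile(base):
        yield base
        return
    for root, _dirs, files in os.walk(base):
        for name in files:
            yield os.path.join(root, name)
```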
- class pyts2.utils.CatchSignalThenExit(signals=[<Signals.SIGABRT: 6>, <Signals.SIGINT: 2>, <Signals.SIGTERM: 15>, <Signals.SIGHUP: 1>], exit=True, returncode=1)¶
Context manager to catch any signals, then exit. Primarily used to prevent early external termination of critical processes.
with CatchSignalThenExit(exit=True, returncode=1):
    do_something_critical()
In the above, if the program receives some signal (SIG{ABRT,INT,TERM,HUP}) during the body of the with statement, then at the close of the with statement, exit with status 1.
- Parameters
signals (list of signal module signal types) – Signals to catch
exit (bool) – If a signal is caught, exit script when object exits? (i.e. end of a with block)
returncode (int) – Code to return when exiting, only used if exit=True
- handler(*args)¶
Handles signals.
- Parameters
*args – Unused arguments from exceptions if provided
- pyts2.utils.XbyY2XY(xbyy)¶
Converts a string like 10x20 into a tuple: (10, 20)
- Parameters
xbyy (str) – String to convert.
- Returns
Tuple from string.
- Return type
tuple
>>> XbyY2XY("10x20")
(10, 20)
>>> XbyY2XY("1X2")
(1, 2)
>>> XbyY2XY((1, 2))  # Pass pre-tupleised coords through
(1, 2)
- pyts2.utils.index2rowcol(index, rows, cols, order='colsright')¶
Converts an index to a row and column within a rows-by-cols grid, filled in the given order. Everything is zero-based, and coordinates are measured from the top left (as in matrices).
- Parameters
index (int) – Index to convert
rows (int) – Rows of target grid
cols (int) – Cols of target grid
order (str, optional) – Unimplemented. Only accepts colsright, other orderings to be implemented.
>>> index2rowcol(10, 5, 5)  # first row, 3rd col
(0, 2)
>>> index2rowcol(1, 5, 5, "colsright")  # 2nd row, first col
(1, 0)
>>> index2rowcol(25, 5, 5, "colsright")  # past end of matrix
Traceback (most recent call last):
...
ValueError: index is larger than it should be given rowsXcols
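The examples above are consistent with a column-major fill: each column is filled top to bottom before moving one column to the right, so index = col * rows + row. A minimal sketch of that arithmetic follows. index_to_rowcol is a hypothetical name, and rejecting negative indices is an assumption, since the reference only documents the too-large case.

```python
def index_to_rowcol(index, rows, cols):
    """Column-major conversion: fill a column top to bottom, then move right."""
    if not 0 <= index < rows * cols:
        raise ValueError("index does not fit in a rows x cols grid")
    # index = col * rows + row, so divmod by rows recovers both parts.
    col, row = divmod(index, rows)
    return row, col


print(index_to_rowcol(10, 5, 5))  # (0, 2)
print(index_to_rowcol(1, 5, 5))   # (1, 0)
```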
- class pyts2.utils.PathAwareJsonEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)¶
Extends json.JSONEncoder to encode paths.
Constructor for JSONEncoder, with sensible defaults.
If skipkeys is false, then it is a TypeError to attempt encoding of keys that are not str, int, float or None. If skipkeys is True, such items are simply skipped.
If ensure_ascii is true, the output is guaranteed to be str objects with all incoming non-ASCII characters escaped. If ensure_ascii is false, the output can contain non-ASCII characters.
If check_circular is true, then lists, dicts, and custom encoded objects will be checked for circular references during encoding to prevent an infinite recursion (which would cause an OverflowError). Otherwise, no such check takes place.
If allow_nan is true, then NaN, Infinity, and -Infinity will be encoded as such. This behavior is not JSON specification compliant, but is consistent with most JavaScript based encoders and decoders. Otherwise, it will be a ValueError to encode such floats.
If sort_keys is true, then the output of dictionaries will be sorted by key; this is useful for regression tests to ensure that JSON serializations can be compared on a day-to-day basis.
If indent is a non-negative integer, then JSON array elements and object members will be pretty-printed with that indent level. An indent level of 0 will only insert newlines. None is the most compact representation.
If specified, separators should be an (item_separator, key_separator) tuple. The default is (', ', ': ') if indent is None and (',', ': ') otherwise. To get the most compact JSON representation, you should specify (',', ':') to eliminate whitespace.
If specified, default is a function that gets called for objects that can’t otherwise be serialized. It should return a JSON encodable version of the object or raise a TypeError.
- default(obj)¶
Encodes Path objects as strings to allow JSON serialisation.
- Parameters
obj (pathlib.Path) – Object to encode.
- Returns
JSON-serialisable version of obj
- Return type
str
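The usual way to build such an encoder is to override default() on a json.JSONEncoder subclass and defer to the parent for everything else. The class below, PathEncoder, is a hypothetical sketch of that pattern rather than the library's source. It uses PurePosixPath in the example so the output is the same on every platform.

```python
import json
import pathlib


class PathEncoder(json.JSONEncoder):
    """Illustrative encoder that serialises path objects as strings."""

    def default(self, obj):
        if isinstance(obj, pathlib.PurePath):
            return str(obj)
        # Anything else falls through to the standard TypeError.
        return super().default(obj)


record = {"image": pathlib.PurePosixPath("a/b.jpg")}
print(json.dumps(record, cls=PathEncoder))  # {"image": "a/b.jpg"}
```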