API Reference¶
readfish._config module¶
- class readfish._config.Conf(channels, caller_settings, mapper_settings, split_axis=1, regions=NOTHING, barcodes=NOTHING)[source]¶
Overall configuration for readfish experiments
The Conf class is the mother of the adaptive sampling experiment. It is constructed from the provided TOML file, via a call to from_file. A usage sketch is shown after the class methods below.
- Parameters:
channels (int) – The number of channels on the flow cell
caller_settings (CallerSettings) – The caller settings as listed in the TOML
mapper_settings (MapperSettings) – The mapper settings as listed in the TOML
split_axis (int) – The axis on which to split a flowcell if there are multiple regions. 0 is horizontal, 1 is vertical.
regions (List[Region]) – The regions as listed in the Toml file.
barcodes (Dict[str, Barcode]) – A Dictionary of barcode names to Barcode Classes
_channel_map – A map of channel numbers (1 to flowcell size) to the index of the Region (in self.regions) they are part of.
- describe_experiment()[source]¶
Describe the experiment from the given Conf class. For Barcodes we describe the targets and the conditions, but not the region.
- Returns:
The description string, human readable.
- Return type:
str
- classmethod from_dict(dict_, channels)[source]¶
Create the Conf class from a Dictionary
- Parameters:
dict_ (dict) – The dictionary that contains the parsed TOML file
channels (int) – The number of channels on the flow cell
- Raises:
ValueError – If channels is present in the TOML file, as it would overwrite the channels argument passed to this method
- Returns:
The constructed Conf class
- Return type:
Conf
- classmethod from_file(path, channels, logger=None)[source]¶
Create a Conf from a TOML file.
Loads the TOML using rtoml, then calls from_dict to create the class.
- get_conditions(channel, barcode)[source]¶
Get the condition for this channel or barcode from the Conf TOML
The barcoder should return the barcode name, e.g. barcode01, or unclassified if a barcode could not be assigned. If barcoding is not being done then the barcode should be None and channel will be used instead.
- Parameters:
channel (int) – The channel number to look up
barcode (str | None) – The barcode name, or None if barcoding is not in use
- Returns control:
Whether this channel/barcode combination is a control condition
- Returns condition:
The condition (Region or Barcode) that applies to this channel/barcode combination
- Raises:
ValueError – In the event that the channel/barcode combination does not find a Region or a Barcode
- Return type:
- get_targets(channel, barcode)[source]¶
Get the targets for a given channel or barcode, via its condition
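A minimal usage sketch of the Conf class. The TOML path "experiment.toml" and the 512-channel count are placeholders; any valid readfish experiment TOML and the matching flow cell size can be used.

from readfish._config import Conf

# Hypothetical TOML path; channels must match the flow cell in use.
conf = Conf.from_file("experiment.toml", channels=512)
print(conf.describe_experiment())

# Per the Returns entries above, get_conditions yields a (control, condition) pair.
# barcode=None when barcoding is not being done.
control, condition = conf.get_conditions(1, None)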
- class readfish._config._Condition(name, single_on, single_off, multi_on, multi_off, no_map, no_seq, control=False, targets=NOTHING, min_chunks=1, max_chunks=2, below_min_chunks=Action.proceed, above_max_chunks=Action.unblock)[source]¶
Representation of an experimental condition. This can either be a Barcode or an experimental Region of the flow cell. A sketch of how these fields typically appear in a parsed configuration follows the parameter list.
- Parameters:
name (str) – The name of the condition.
single_on (Action) – The Action to perform when a read has a single, on-target, alignment
single_off (Action) – The Action to perform when a read has a single, off-target, alignment
multi_on (Action) – The Action to perform when a read has multiple alignments, with at least one on-target
multi_off (Action) – The Action to perform when a read has multiple alignments, with all off-target
no_map (Action) – The Action to perform when a read has no alignments
no_seq (Action) – The Action to perform when a read did not basecall
control (bool) – Whether the region should be treated as a control. Defaults to False
targets (Targets) – The target sequences for the condition. See Targets for details
min_chunks (int) – The minimum number of chunks required before a decision will be made. Defaults to 1
max_chunks (int) – The maximum number of chunks that readfish will assess for any single read. Defaults to 2
below_min_chunks (Action) – The Action to take when we haven’t evaluated at least min_chunks. Defaults to Action.proceed
above_max_chunks (Action) – The Action to take when we have exceeded max_chunks. Defaults to Action.unblock
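Illustrative sketch only: a single region's condition as it might look once the experiment TOML has been parsed into a dictionary. The keys mirror the _Condition fields above; the region name and targets path are hypothetical.

region_dict = {
    "name": "chr1_enrichment",        # hypothetical region name
    "control": False,
    "min_chunks": 1,
    "max_chunks": 2,
    "targets": "chr1_targets.bed",    # hypothetical targets file
    "single_on": "stop_receiving",
    "multi_on": "stop_receiving",
    "single_off": "unblock",
    "multi_off": "unblock",
    "no_map": "proceed",
    "no_seq": "proceed",
}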
- class readfish._config.Barcode(...)[source]¶
See _Condition for details
- class readfish._config.Region(...)[source]¶
See _Condition for details
- class readfish._config._PluginModule(name, parameters)[source]¶
A plugin module
- Parameters:
name (str) – The name of the plugin module
parameters (dict) – The parameters to pass to the plugin module
- classmethod from_dict(params)[source]¶
Creates an instance of the _PluginModule class from a dictionary.
- Parameters:
params (Dict[str, Dict]) – A dictionary containing a single key-value pair, where the key is the name of the plugin module and the value is a dictionary of parameters to be passed to the plugin module.
- Raises:
ValueError – If more than one key-value pair is provided in the params dictionary
- Returns:
An instance of the _PluginModule class with the specified name and parameters.
- Return type:
_PluginModule
- load_module(override=False)[source]¶
Load a plugin module with the given name.
If the module is a built-in plugin (as specified in the builtins dictionary), it is loaded from the readfish.plugins package. Otherwise, it is loaded using the importlib library.
- Parameters:
override – If True, the built-in module names are ignored. Default is False.
- Returns:
The loaded module.
- Raises:
ModuleNotFoundError – If the plugin module cannot be found or loaded.
Note that this method is intended to be used as part of a plugin system, where plugin modules are loaded dynamically at runtime. The builtins dictionary maps the names of built-in plugins to the actual module names, and is used to avoid having to specify the full module name when loading a built-in plugin. If override=True, the builtin module names are ignored.
- load_object(obj, *, init=True, override=False, **kwargs)[source]¶
Load a specified object from a plugin module.
First load_module is called to load the plugin module, then the specified object is retrieved from the module.
- Parameters:
obj (str) – The name of the object to load from the plugin module.
init (bool) – If True, the returned object is initialized with the parameters provided to the constructor of the parent class, as well as any additional keyword arguments passed in via the **kwargs parameter.
override (bool) – If True, ignore builtin readfish plugins.
kwargs – Additional keyword arguments to pass to the constructor of the loaded object.
- Returns:
The specified object from the plugin module. If init=True, the object is initialized with the provided parameters and returned.
- Raises:
ModuleNotFoundError – If the plugin module cannot be found or loaded.
AttributeError – If the specified object cannot be found in the plugin module.
TypeError – If the runtime **kwargs conflict with the module parameters from the TOML file.
- class readfish._config.CallerSettings(...)[source]¶
See _PluginModule for details
- class readfish._config.MapperSettings(...)[source]¶
See _PluginModule for details
readfish._loggers module¶
- readfish._loggers.setup_logger(name, header=None, log_format='%(message)s', log_file=None, log_console=False, mode='a', level=10, propagate=False, queue_bound=100000)[source]¶
Configures and returns a logging.Logger object with handlers determined by the values of log_file and log_console, using the specified format and level.
A custom header can be included if logging to a file. Log messages will be formatted using the provided format string.
- Parameters:
name (str) – Name to assign to the logger.
header (str | None) – Optional header to write at the top of the log file.
log_format (str) – Format string for log messages using % formatting, default is “%(message)s”.
log_file (str | None) – Path to the file where logs should be written.
log_console (bool) – Whether to log to console. If True, a console StreamHandler is added.
mode (str) – Mode to use when opening the log file, default is ‘a’ (append).
level (int) – Logging level, where logging.LEVEL is one of (DEBUG, INFO, WARNING, ERROR, CRITICAL). Default is logging.DEBUG.
propagate (bool) – Whether the logger should propagate messages to higher-level loggers, default is False.
queue_bound (int) – Maximum number of log messages to store in the queue, default is 100_000. If full, adding to queue will block until space is available.
- Returns:
Configured logging.Logger instance.
- Example:
>>> logger = setup_logger('my_logger', log_console=True, level=logging.INFO)
>>> logger.info('This is an info message')
>>> import tempfile
>>> with tempfile.NamedTemporaryFile(mode='w+', delete=True) as tmpfile:
...     logger = setup_logger('my_logger', log_file=tmpfile.name, header='Time Message', level=logging.INFO)
- Raises:
IOError – If an I/O error occurs while opening or writing to the file.
- Note:
- If log_file is specified, a QueueHandler and QueueListener will be used to send logs to the specified file. The Queue will be bounded, with a default size of 100_000. Putting to the queue will block if full.
- If log_file is specified and log_console is False, logs will only be recorded to the specified file.
- If log_console is True, logs will be sent to console irrespective of whether log_file is specified.
- If log_file is None and log_console is False, logs will be sent to a logging.NullHandler instance.
- If header is provided and the file specified by filename already exists, the header will not be written to the file.
- Return type:
logging.Logger
- readfish._loggers.print_args(args, printer, exclude=None)[source]¶
Prints and formats all arguments from the command line. Takes all entirely lowercase attributes of args and prints them using the provided printer function (expected to be print or logging.Logger.info). The exclude parameter can be used to exclude certain attributes from being printed.
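A small sketch of print_args with a parsed argparse namespace. The argument names here are arbitrary, and exclude is assumed to accept an iterable of attribute names.

import argparse
from readfish._loggers import print_args, setup_logger

parser = argparse.ArgumentParser()
parser.add_argument("--device", default="MS00000")   # placeholder argument
args = parser.parse_args([])
logger = setup_logger("args_logger", log_console=True)
print_args(args, printer=logger.info, exclude=["device"])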
readfish._utils module¶
utils.py functions and utilities used internally.
- class readfish._utils.Severity(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]¶
Bases: IntEnum
Severity states for messaging to MinKNOW
- Parameters:
INFO – Info level
WARN – Warn level
ERROR – Error level
- ERROR = 3¶
- INFO = 1¶
- WARN = 2¶
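Because Severity is an IntEnum, its members behave as integers, which is how the levels are passed in messages to MinKNOW. A minimal illustration:

>>> from readfish._utils import Severity
>>> int(Severity.WARN)
2
>>> Severity.ERROR > Severity.INFO
True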
- readfish._utils.compress_and_encode_string(original_str)[source]¶
Compresses a string, encodes it in base-64, and returns an ASCII string representation of the compressed blob.
- readfish._utils.decode_and_decompress_string(encoded_str)[source]¶
Decodes an ASCII string representation of a compressed blob, decompresses it, and returns the original string. This is the reverse of compress_and_encode_string.
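As documented, the two functions are inverses, so a round trip should recover the original string. A minimal illustration:

>>> from readfish._utils import compress_and_encode_string, decode_and_decompress_string
>>> blob = compress_and_encode_string("readfish " * 10)
>>> decode_and_decompress_string(blob) == "readfish " * 10
True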
- readfish._utils.draw_flowcell_split(flowcell_size, split=1, axis=1, index=0, prefix='\t')[source]¶
Draw unicode representation of the flowcell. If the flowcell is split more than once, and index is passed, the region of the flowcell represented by the index is highlighted solid, whilst the rest is filled with Xs
Rather than representing all the possible channels, we draw a 32-column-wide flowcell for GridION and 120 for PromethION, and divide accordingly
Example
draw_flowcell_split(512)
    XXXX
    XXXX
draw_flowcell_split(512, split=2)
    XX00
    XX00
draw_flowcell_split(512, split=2, index=1)
    00XX
    00XX
>>> print(draw_flowcell_split(126, 13, index=1, axis=1, prefix=""))
.#...........
.#...........
.#...........
.#...........
.#...........
>>> print(draw_flowcell_split(126, 5, index=1, axis=0, prefix=""))
.............
.............
.............
#############
.............
- Parameters:
flowcell_size (int) – The total number of channels on the flowcell; 126 for Flongle, 512 for MinION, and 3000 for PromethION
split (int) – The number of sections to split the flowcell into, defaults to 1
axis (int) – The axis on which to split the flowcell, 0 is horizontal, 1 is vertical, defaults to 1
index (int) – The index of the region to highlight, defaults to 0
prefix (str) – A string prepended to each row of the drawing, defaults to '\t'
- Returns:
String representation of the flowcell in ASCII art
- Return type:
str
- readfish._utils.escape_message_to_minknow(message, chars)[source]¶
Escape characters in the chars list if they are in message
>>> escape_message_to_minknow("20%", ["%"])
'20\\%'
>>> escape_message_to_minknow("20\\%", ["%"])
'20\\%'
>>> escape_message_to_minknow("20", ["%"])
'20'
- readfish._utils.format_bases(num, factor=1000, suffix='B')[source]¶
Return a human readable string of a large number using SI unit prefixes
- Parameters:
num (int) – A number to convert to decimal form
factor (int) – The SI factor to divide by, defaults to 1000
suffix (str) – The suffix to place after the SI prefix, defaults to 'B'
- Returns:
The input number formatted to two decimal places with the SI unit and suffix
- Example:
>>> format_bases(1_000)
'1.00 kB'
>>> format_bases(1_000_000)
'1.00 MB'
>>> format_bases(1_630_000)
'1.63 MB'
>>> format_bases(1_000_000_000)
'1.00 GB'
- Return type:
str
- readfish._utils.generate_flowcell(flowcell_size, split=1, axis=1, odd_even=False)[source]¶
Return a list of lists with channels to use in conditions
Representations generated by this method are evenly split based on the physical layout of the flowcell. Each sub-list is the same size. Axis determines whether the flowcell divisions will go left-right (0) or top-bottom (1). As flongle has a shape of (10, 13) the top-bottom axis cannot be split evenly.
- Parameters:
flowcell_size (int) – The total number of channels on the flowcell; 126 for Flongle, 512 for MinION, and 3000 for PromethION
split (int) – The number of sections to split the flowcell into, must be a positive factor of the flowcell dimension, defaults to 1
axis (int) – The axis along which to split, see: https://docs.scipy.org/doc/numpy/glossary.html?highlight=axis, defaults to 1
odd_even (bool) – Return a list of two lists split into odd-even channels, ignores split and axis, defaults to False
- Raises:
ValueError – Raised when split is not a positive integer
ValueError – Raised when the value for split is not a factor on the axis provided
- Returns:
A list of lists with channels divided equally
- Return type:
>>> len(generate_flowcell(512))
1
>>> len(generate_flowcell(512)[0])
512
>>> len(generate_flowcell(512, split=4))
4
>>> for x in generate_flowcell(512, split=4):
...     print(len(x))
128
128
128
128
>>> generate_flowcell(512, split=5)
Traceback (most recent call last):
    ...
ValueError: The flowcell cannot be split evenly
>>> for x in generate_flowcell(126, 5, axis=0):
...     print(len(x))
26
26
26
26
26
- readfish._utils.get_coords(channel, flowcell_size)[source]¶
Return a channel’s coordinates given a flowcell size
- Parameters:
channel (int) – The channel number to get coordinates for
flowcell_size (int) – The total number of channels on the flowcell; 126 for Flongle, 512 for MinION, and 3000 for PromethION
- Returns:
The column and row of a channel number in the flowcell
- Raises:
ValueError – channel cannot be below 0 or above flowcell_size
ValueError – Raised if flowcell_size is not one of [126, 512, 3000]
- Return type:
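A minimal sketch, assuming the return value is a (column, row) pair as the Returns entry describes; the exact coordinates depend on the device layout, so no output is asserted here.

>>> from readfish._utils import get_coords
>>> column, row = get_coords(1, flowcell_size=512)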
- readfish._utils.get_device(device, host='127.0.0.1', port=None)[source]¶
Get a position for a specific device over the MinKNOW API
- Parameters:
device (str) – The name of the position to look up
host (str) – The host address of the machine running MinKNOW, defaults to '127.0.0.1'
port (int | None) – The port to connect to the MinKNOW manager on, defaults to None
- Raises:
ValueError – If there is no match on any of the positions for the given device name
- Returns:
The position representation from the MinKNOW API
- Return type:
FlowCellPosition
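A usage sketch, assuming a MinKNOW instance is running locally; the position name "X1" is a placeholder.

from readfish._utils import get_device

position = get_device("X1", host="127.0.0.1")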
- readfish._utils.get_flowcell_array(flowcell_size)[source]¶
Return a numpy.ndarray in the shape of a flowcell
- Parameters:
flowcell_size (int) – The total number of channels on the flowcell; 126 for Flongle, 512 for MinION, and 3000 for PromethION
- Returns:
An N-dimensional array representation of the flowcell
- Return type:
numpy.ndarray
>>> get_flowcell_array(126).shape
(10, 13)
>>> get_flowcell_array(512).shape
(16, 32)
>>> get_flowcell_array(3000).shape
(25, 120)
>>> get_flowcell_array(128)
Traceback (most recent call last):
    ...
ValueError: flowcell_size is not recognised
>>> get_flowcell_array(126)[9][-1]
0
>>> get_flowcell_array(512)[15][-1]
1
- readfish._utils.iter_exception_group(exc, level=0)[source]¶
Traverses an exception tree, yielding formatted strings for each exception encountered
- Parameters:
exc (BaseExceptionGroup) – The exception group to traverse
level (int) – The current indentation level, defaults to 0
- Yield:
Formatted (and indented) string representation of each exception encountered in the tree.
>>> exc = BaseExceptionGroup(
...     "level 1.0",
...     [
...         BaseExceptionGroup(
...             "level 2.0",
...             [
...                 BaseExceptionGroup(
...                     "level 3.0",
...                     [
...                         ValueError("abc"),
...                         KeyError("99"),
...                         BaseExceptionGroup("level 4.0", [TabError("nu uh")]),
...                     ],
...                 )
...             ],
...         ),
...         BaseExceptionGroup("level 2.1", [ValueError("345")]),
...     ],
... )
>>> print("\n".join(iter_exception_group(exc)))
level 1.0 (2 sub-exceptions):
 level 2.0 (1 sub-exception):
  level 3.0 (3 sub-exceptions):
  - ValueError('abc')
  - KeyError('99')
   level 4.0 (1 sub-exception):
   - TabError('nu uh')
 level 2.1 (1 sub-exception):
 - ValueError('345')
- readfish._utils.nested_get(obj, key, default=None, *, delim='.')[source]¶
Get a value from a nested structure
>>> class C:
...     def __init__(self, x=None):
...         self.x = x
...     def __repr__(self): return f"C(x={self.x!r})"
>>> data = {"a": {"b": {"c": "d", "e": C(999)}}}
>>> cls = C(C(data))
>>> nested_get(data, "a.b.c")
'd'
>>> nested_get(data, "a.b.c", 0)
'd'
>>> nested_get(data, "a.b.c.d.e", 0)
0
>>> nested_get(cls, "x.x")
{'a': {'b': {'c': 'd', 'e': C(x=999)}}}
>>> nested_get(cls, "x.x.a.b.e.x")
999
>>> nested_get(cls, "missing", "MISSING")
'MISSING'
- Parameters:
obj – The object or mapping to traverse
key (str) – The delimited key path to look up
default – The value to return if the key is not found, defaults to None
delim (str) – The delimiter between parts of the key, defaults to '.'
readfish._cli_args module¶
Store for command line arguments and defaults; these are used by readfish entry points.
These are held here in an agnostic format and the actual CLI is generated by readfish._cli_base.
The two primary items that are exported are BASE_ARGS and DEVICE_BASE_ARGS, which define different sets of command line arguments for different purposes. BASE_ARGS are the minimal required arguments for _all_ entry points, as they are used for initialising loggers. DEVICE_BASE_ARGS are the set of arguments that are used for connecting to a sequencer (device), plus some other related settings for selective sequencing scripts.
- class readfish._cli_args.Chemistry(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]¶
Bases: Enum
- DUPLEX = 'duplex'¶
For the “smarter” version of duplex: does this read map to the opposite strand of the previous read, on the same contig? Won’t work for no-map based decisions.
- DUPLEX_SIMPLE = 'duplex_simple'¶
Simple duplex – if we are going to unblock a read and the previous read on the same channel was stop_receiving, sequence the current read instead.
- SIMPLEX = 'simplex'¶
Normal simplex chemistry - no duplex override shenanigans
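Because Chemistry is a plain Enum keyed by string values, a mode supplied as text (for example from the CLI or a TOML file) can be converted directly:

>>> from readfish._cli_args import Chemistry
>>> Chemistry("duplex_simple") is Chemistry.DUPLEX_SIMPLE
True
>>> Chemistry.DUPLEX.value
'duplex'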
readfish._statistics module¶
This module contains the ReadfishStatistics class, which is designed to track and manage statistics pertaining to a single Readfish run. The ReadfishStatistics class is able to update and query various statistics and counters regarding the performance, decisions, actions, and conditions of Readfish runs.
The ReadfishStatistics class has the ability to compute and return averages related to chunks per second, batch time, and batch size, and it maintains various counters to keep track of the number of chunks processed, actions taken, decisions made, and conditions met. The class also facilitates the addition of new performance and read records to the existing statistics.
- Example:
>>> from readfish._statistics import ReadfishStatistics, DEBUG_STATS_LOG_FIELDS
>>> stats = ReadfishStatistics(None)
>>> stats.add_batch_performance(1, 1)
>>> stats.log_read(**dict(zip(DEBUG_STATS_LOG_FIELDS, (1, 2, "test_read_id", 7, 100, 3, "single_on", "stop_receiving", "exp_region", None, None, False, 0.0))), region_name="naff", overridden_action_name=None)
>>> print(stats.get_batch_performance())
0001R/1.0000s; Avg: 0001R/1.0000s; Seq:1; Unb:0; Pro:0; Slow batches (>1.00s): 0/1
>>> print(stats.decisions)
Counter({'single_on': 1})
- class readfish._statistics.ReadfishStatistics(log_file, break_reads_seconds=1.0, total_chunks=0, actions=NOTHING, conditions=NOTHING, actions_conditions=NOTHING, decisions=NOTHING, first_read_overrides=NOTHING, batch_statistics=NOTHING, lock=NOTHING)[source]¶
Bases: object
A class for tracking and managing statistics for individual Readfish runs.
The ReadfishStatistics class is designed to manage and present statistics from individual Readfish runs, providing insights into performance, decisions, actions, and conditions encountered during the runs.
- Variables:
break_reads_seconds – The number of seconds between each collection of chunk signal. Default 1.0.
log_file – The name of the log file to write to. If None, no file is output.
total_chunks – The total number of chunks processed.
actions – A counter tracking the number of Actions sent.
conditions – A counter tracking the number of reads seen for each Condition.
actions_conditions – A counter tracking the number of condition/action/decision combinations.
decisions – A counter tracking the number of decisions made, globally.
first_read_overrides – A counter tracking whether the first read seen on a channel was sequenced (because readfish was started during sequencing) or unblocked if it was not.
batch_statistics – A counter tracking performance metrics such as summed batch times, total chunks in batches, and number of batches seen.
Example:
>>> stats = ReadfishStatistics(None)
>>> stats.add_batch_performance(1, 1)
>>> stats.log_read(**dict(zip(DEBUG_STATS_LOG_FIELDS, (1, 2, "test_read_id", 7, 100, 3, "single_on", "stop_receiving", "exp_region", None, None, False, 0.0))), region_name="naff", overridden_action_name=None)
>>> print(stats.get_batch_performance())
0001R/1.0000s; Avg: 0001R/1.0000s; Seq:1; Unb:0; Pro:0; Slow batches (>1.00s): 0/1
>>> print(stats.decisions)
Counter({'single_on': 1})
Example with log file
>>> import tempfile
>>> import os
>>> import time
>>> from pprint import pformat
>>> from readfish._statistics import ReadfishStatistics, DEBUG_STATS_LOG_FIELDS
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     # Change the current working directory to the temporary directory
...     os.chdir(tmpdir)
...     # Use the current directory for the log file
...     log_file_name = "readfish.log"
...     # Create an instance of ReadfishStatistics with the log file in the temporary directory
...     stats = ReadfishStatistics(log_file=log_file_name)
...     # Use the log_read method to log a sample read
...     stats.log_read(**dict(zip(DEBUG_STATS_LOG_FIELDS, (1, 2, "test_read_id", 7, 100, 3, "single_on", "stop_receiving", "exp_region", None, None, False, 0.0))), region_name="naff", overridden_action_name=None)
...     # in this test, we need a small amount of time to allow the logger to write the file
...     time.sleep(0.1)
...     # Read the content of the file
...     with open(log_file_name, 'r') as log_file:
...         content = log_file.read()
...     # Prepare the expected content
...     header = " ".join(DEBUG_STATS_LOG_FIELDS)
...     expected_line = " ".join(map(str, (1, 2, "test_read_id", 7, 100, 3, "single_on", "stop_receiving", "exp_region", None, None, False, 0.0)))
...     expected = f"{header}\n{expected_line}"
...     # Check that the content matches, don't ask about the replaces, it was the only way
...     expected.replace(" ", " ") == content.replace("\t", " ").strip()
True
- property average_chunks_per_second: float¶
Calculate and return the average number of chunks processed per second.
- Returns:
Average number of chunks processed per second.
Given: batch_statistics = {“batch_count”: 2, “batch_size”: 100, “batch_time”: 50}
>>> stats = ReadfishStatistics(None)
>>> stats.add_batch_performance(number_of_reads=10, batch_time=5)
>>> stats.average_chunks_per_second
2.0
More complex example:
>>> stats = ReadfishStatistics(None)
>>> stats.add_batch_performance(number_of_reads=10, batch_time=5)
>>> stats.add_batch_performance(number_of_reads=10, batch_time=5)
>>> stats.add_batch_performance(number_of_reads=40, batch_time=5)
>>> stats.average_chunks_per_second
4.0
When batch_count is 0, the result will be 0.
>>> stats.batch_statistics["batch_count"] = 0
>>> stats.average_chunks_per_second
0
- property average_batch_time: float¶
Calculate and return the average time taken per batch.
Examples:
Given: batch_statistics = {“batch_count”: 3, “cumulative_batch_size”: 150, “cumulative_batch_time”: 60}
>>> stats = ReadfishStatistics(None)
>>> stats.batch_statistics = {"batch_count": 3, "cumulative_batch_size": 150, "cumulative_batch_time": 60}
>>> stats.average_batch_time
20.0
When batch_count is 0, the result should be 0.
>>> stats.batch_statistics["batch_count"] = 0
>>> stats.average_batch_time
0
- property average_batch_size: float¶
Calculate and return the average size of processed batches.
The method computes the average batch size by dividing the total number of chunks processed by the number of batches seen. If no batches have been processed, the method returns 0.
- Returns:
Average number of reads processed per batch.
Example:
>>> stats = ReadfishStatistics(None)
>>> stats.average_batch_size
0
>>> stats.add_batch_performance(50, 20.0)
>>> stats.add_batch_performance(100, 20.0)
>>> stats.average_batch_size
75.0
- get_batch_performance()[source]¶
Generate and return a formatted string representing batch performance.
If no batches have been processed, a placeholder message is returned.
- Returns:
String summary of the current performance metrics.
- Return type:
str
Examples:
When no batches have been processed:
>>> stats = ReadfishStatistics(None)
>>> stats.batch_statistics = {"batch_count": 0, "cumulative_batch_size": 0, "cumulative_batch_time": 0, "batch_size": 0, "batch_time": 0}
>>> stats.get_batch_performance()
'No performance data yet'
When 100 chunks are processed in 10 seconds and the run has been lagging for 6 consecutive batches:
>>> stats = ReadfishStatistics(None)
>>> stats.batch_statistics.update({"batch_count": 6, "cumulative_batch_size": 100, "cumulative_batch_time": 10, "batch_size": 10, "batch_time": 10, "cumulative_lagging_batches": 6, "consecutive_lagging_batches": 6})
>>> stats.get_batch_performance()
'0010R/10.0000s; Avg: 0016R/1.6667s; Seq:0; Unb:0; Pro:0; Slow batches (>1.00s): 6/6'
When three batches of total 300 chunks are processed in a total of 45 seconds:
>>> stats = ReadfishStatistics(None)
>>> stats.batch_statistics.update({"batch_count": 3, "cumulative_batch_size": 300, "cumulative_batch_time": 45, "batch_size": 300, "batch_time": 45})
>>> stats.get_batch_performance()
'0300R/45.0000s; Avg: 0100R/15.0000s; Seq:0; Unb:0; Pro:0; Slow batches (>1.00s): 0/3'
When five batches of total 500 chunks are processed in a total of 120 seconds:
>>> stats = ReadfishStatistics(None)
>>> stats.batch_statistics.update({"batch_count": 5, "cumulative_batch_size": 500, "cumulative_batch_time": 120, "batch_size": 500, "batch_time": 120})
>>> stats.get_batch_performance()
'0500R/120.0000s; Avg: 0100R/24.0000s; Seq:0; Unb:0; Pro:0; Slow batches (>1.00s): 0/5'
- add_batch_performance(number_of_reads, batch_time)[source]¶
Update the collected statistics with new batch performance data.
This method integrates a new set of chunk batch performance metrics into the class’s statistics, specifically updating the batch size, batch time, and batch count based on the provided number of reads and the time taken.
- Parameters:
number_of_reads (int) – The number of read chunks in the batch
batch_time (float) – The time taken to process the batch, in seconds
Example:
>>> stats = ReadfishStatistics(None)
>>> stats.add_batch_performance(100, 10.5)
>>> stats.batch_statistics
Counter({'cumulative_batch_size': 100, 'batch_size': 100, 'cumulative_batch_time': 10.5, 'batch_time': 10.5, 'batch_count': 1, 'cumulative_lagging_batches': 1, 'consecutive_lagging_batches': 1})
>>> stats.add_batch_performance(100, 10.5)
>>> stats.batch_statistics
Counter({'cumulative_batch_size': 200, 'batch_size': 100, 'cumulative_batch_time': 21.0, 'batch_time': 10.5, 'batch_count': 2, 'cumulative_lagging_batches': 2, 'consecutive_lagging_batches': 2})
- log_read(region_name, overridden_action_name, **kwargs)[source]¶
Add a new read chunk record into the collected statistics, and log it to the debug logger.
The following terms are used in this function: decision is expected to be one of unblock, stop_receiving, etc.; mode is expected to be one of single_on, single_off, multi_on, etc. The term “action” is used to describe what the sequencer actually did. #ToDo: see and address issue #298
- Parameters:
region_name – The name of the region on the flow cell.
overridden_action_name – Optional, if the originally determined action was overridden, the name of the NEW action.
readfish._cli_base module¶
Main entry point for command line read until scripts.
Set as the entry point in pyproject.toml
- readfish._cli_base.main(argv=None)[source]¶
Main function for entry point of the read until scripts.
- Parameters:
argv (List[str] | None) – Command line arguments to parse, defaults to None
- Raises:
SystemExit – Raises a system exit when the command function exits.
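A minimal sketch of invoking the entry point programmatically. --help is used here because, as documented above, the command function exits via SystemExit.

from readfish._cli_base import main

try:
    main(["--help"])
except SystemExit as exc:
    print(f"readfish exited with code {exc.code}")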