fireworks.features package¶
Subpackages¶
Submodules¶
fireworks.features.background_task module¶
-
class
fireworks.features.background_task.
BackgroundTask
(tasks, num_launches=0, sleep_time=60, run_on_finish=False)¶ Bases:
fireworks.utilities.fw_serializers.FWSerializable
,object
-
classmethod
from_dict
(*args, **kwargs)¶
-
to_dict
(*args, **kwargs)¶
fireworks.features.dupefinder module¶
-
class
fireworks.features.dupefinder.
DupeFinderBase
¶ Bases:
fireworks.utilities.fw_serializers.FWSerializable
This serves as an abstract base class for implementing duplicate finders.
-
classmethod
from_dict
(m_dict)¶
-
query
(spec)¶ Given a spec, returns a database query that gives potential candidates for duplicated Fireworks.
Parameters: spec (dict) – spec to check for duplicates
-
to_dict
(*args, **kwargs)¶
-
verify
(spec1, spec2)¶ Checks whether two specs are similar enough to be considered duplicates. Returns True if they are duplicates.
Args: spec1 (dict) spec2 (dict)
Returns: bool
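The query/verify split described above can be illustrated with a minimal sketch: query narrows the database to candidate documents, and verify makes the final decision on a candidate pair. The class name ExactSpecDupeFinder and the "spec" field name in the filter are assumptions made for illustration. In real use the class would subclass DupeFinderBase; it is kept standalone here so that it runs without FireWorks installed.

```python
class ExactSpecDupeFinder:
    """Hypothetical duplicate finder: two specs are duplicates only if equal.

    Standalone stand-in for a DupeFinderBase subclass (an assumption made
    so the sketch has no dependency on FireWorks or a database).
    """

    def verify(self, spec1, spec2):
        # Final check on a candidate pair: plain dict equality.
        return spec1 == spec2

    def query(self, spec):
        # MongoDB-style filter that narrows candidates to documents whose
        # stored spec matches exactly. The "spec" field name is assumed.
        return {"spec": spec}


finder = ExactSpecDupeFinder()
spec_a = {"structure": "Si", "task": "relax"}
spec_b = {"structure": "Si", "task": "relax"}
spec_c = {"structure": "Ge", "task": "relax"}

print(finder.verify(spec_a, spec_b))
print(finder.verify(spec_a, spec_c))
print(finder.query(spec_a))
```

A looser finder could return a broader filter from query and do a tolerance-based comparison in verify.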
fireworks.features.fw_report module¶
-
class
fireworks.features.fw_report.
FWReport
(lpad)¶ -
get_stats
(coll='fireworks', interval='days', num_intervals=5, additional_query=None)¶ Compile statistics of completed Fireworks/Workflows for past <num_intervals> <interval>, e.g. past 5 days.
Returns: list, with each item being a dictionary of statistics for a given interval
-
fireworks.features.introspect module¶
-
class
fireworks.features.introspect.
Introspector
(lpad)¶ -
introspect_fizzled
(coll='fws', rsort=True, threshold=10, limit=100)¶
-
print_report
(table, coll)¶
-
-
fireworks.features.introspect.
collect_stats
(list_keys, filter_truncated=True)¶ Turns a list of keys (from flatten_to_keys) into a dict of <str>:count, i.e. counts the number of times each key appears.
Parameters: - list_keys –
- filter_truncated (bool) –
Returns: dict
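The counting step described for collect_stats can be sketched with the standard library. The function name count_keys is hypothetical, and the sketch does not model the filter_truncated option, whose exact behavior is not described here.

```python
from collections import Counter


def count_keys(list_keys):
    """Hypothetical stand-in: map each key string to its number of occurrences."""
    return dict(Counter(list_keys))


keys = ["spec.task", "spec.structure", "spec.task"]
counts = count_keys(keys)
print(counts)
```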
-
fireworks.features.introspect.
compare_stats
(statsdict1, numsamples1, statsdict2, numsamples2, threshold=5)¶
fireworks.features.multi_launcher module¶
-
fireworks.features.multi_launcher.
launch_multiprocess
(launchpad, fworker, loglvl, nlaunches, num_jobs, sleep_time, total_node_list=None, ppn=1, timeout=None, exclude_current_node=False)¶ Launch the jobs in the job packing mode.
Parameters: - launchpad (LaunchPad) –
- fworker (FWorker) –
- loglvl (str) – level at which to output logs
- nlaunches (int) – 0 means ‘until completion’, -1 or “infinite” means to loop forever
- num_jobs (int) – number of sub jobs
- sleep_time (int) – secs to sleep between rapidfire loop iterations
- total_node_list ([str]) – contents of NODEFILE (doesn’t affect execution)
- ppn (int) – processors per node (doesn’t affect execution)
- timeout (int) – # of seconds after which to stop the rapidfire process
- exclude_current_node – Don’t use the script launching node as a compute node
-
fireworks.features.multi_launcher.
ping_multilaunch
(port, stop_event)¶ A single manager to ping all launches during multiprocess launches
Parameters: - port (int) – Listening port number of the DataServer
- stop_event (Thread.Event) – stop event
-
fireworks.features.multi_launcher.
rapidfire_process
(fworker, nlaunches, sleep, loglvl, port, node_list, sub_nproc, timeout, running_ids_dict)¶ Initializes shared data with multiprocessing parameters and starts a rapidfire.
Parameters: - fworker (FWorker) – object
- nlaunches (int) – 0 means ‘until completion’, -1 or “infinite” means to loop forever
- sleep (int) – secs to sleep between rapidfire loop iterations
- loglvl (str) – level at which to output logs to stdout
- port (int) – Listening port number of the shared object manager
- password (str) – security password to access the server
- node_list ([str]) – computer node list
- sub_nproc (int) – number of processors of the sub job
- timeout (int) – # of seconds after which to stop the rapidfire process
-
fireworks.features.multi_launcher.
split_node_lists
(num_jobs, total_node_list=None, ppn=24)¶ Parse node list and processor list from nodefile contents
Returns: (([int],[int])) the node list and processor list for each job
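One plausible way to divide a node list among sub jobs is an even split, sketched below. This is an assumption about the general technique, not a copy of the library's implementation; the helper name split_nodes_evenly and the divisibility check are hypothetical.

```python
def split_nodes_evenly(num_jobs, total_node_list, ppn):
    """Hypothetical helper: give each sub job an equal share of the nodes.

    Returns (node_lists, sub_nproc_list): the nodes assigned to each job and
    the processor count available to each job.
    """
    # Deduplicate, since a nodefile may list a node once per processor.
    unique_nodes = sorted(set(total_node_list))
    if len(unique_nodes) % num_jobs != 0:
        raise ValueError("node count must be divisible by num_jobs")
    per_job = len(unique_nodes) // num_jobs
    node_lists = [
        unique_nodes[i * per_job:(i + 1) * per_job] for i in range(num_jobs)
    ]
    # Each job gets (nodes per job) x (processors per node) processors.
    sub_nproc_list = [per_job * ppn] * num_jobs
    return node_lists, sub_nproc_list


node_lists, sub_nproc_list = split_nodes_evenly(2, ["n1", "n2", "n3", "n4"], ppn=24)
print(node_lists, sub_nproc_list)
```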
-
fireworks.features.multi_launcher.
start_rockets
(fworker, nlaunches, sleep, loglvl, port, node_lists, sub_nproc_list, timeout=None, running_ids_dict=None)¶ Create each sub job and start a rocket launch in each one
Parameters: - fworker (FWorker) – object
- nlaunches (int) – 0 means ‘until completion’, -1 or “infinite” means to loop forever
- sleep (int) – secs to sleep between rapidfire loop iterations
- loglvl (str) – level at which to output logs to stdout
- port (int) – Listening port number
- node_lists ([str]) – computer node list
- sub_nproc_list ([int]) – number of processors for each sub job
- timeout (int) – # of seconds after which to stop the rapidfire process
- running_ids_dict (dict) – Shared dict between processes to record IDs
Returns: ([multiprocessing.Process]) all the created processes
fireworks.features.stats module¶
-
class
fireworks.features.stats.
FWStats
(lpad)¶ -
get_daily_completion_summary
(query_start=None, query_end=None, query=None, time_field=u'time_end', **args)¶ Get daily summary of fireworks for a specified time range.
Parameters: - query_start (str) – The start time (inclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is 30 days before current time.
- query_end (str) – The end time (exclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is current time.
- query (dict) – Additional Pymongo queries to filter entries for process.
- time_field (str) – The field to query time range. Default is “time_end”.
- args (dict) – Time difference to calculate query_start from query_end. Accepts arguments in python datetime.timedelta function. args and query_start can not be given at the same time. Default is 30 days.
Returns: (list) A summary of daily fireworks stats for the specified time range.
-
get_fireworks_summary
(query_start=None, query_end=None, query=None, time_field=u'updated_on', **args)¶ Get fireworks summary for a specified time range.
Parameters: - query_start (str) – The start time (inclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is 30 days before current time.
- query_end (str) – The end time (exclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is current time.
- query (dict) – Additional Pymongo queries to filter entries for process.
- time_field (str) – The field to query time range. Default is “updated_on”.
- args (dict) – Time difference to calculate query_start from query_end. Accepts arguments in python datetime.timedelta function. args and query_start can not be given at the same time. Default is 30 days.
Returns: (list) A summary of fireworks stats for the specified time range.
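The time-range rules shared by these summary methods (end defaults to now, start defaults to 30 days before the end, and the timedelta arguments are mutually exclusive with query_start) can be sketched as a small standalone function. The name resolve_time_range, the use of local time for "current time", and the choice of ValueError are assumptions for illustration.

```python
from datetime import datetime, timedelta


def resolve_time_range(query_start=None, query_end=None, **args):
    """Hypothetical helper: turn the documented arguments into (start, end)."""
    if query_start is not None and args:
        # The two ways of specifying the start must not be combined.
        raise ValueError("args and query_start cannot be given at the same time")
    # End defaults to the current time (local time is an assumption here).
    end = datetime.fromisoformat(query_end) if query_end else datetime.now()
    if query_start is not None:
        start = datetime.fromisoformat(query_start)
    else:
        # Extra keyword arguments are passed to datetime.timedelta;
        # with none given, fall back to the documented 30 days.
        delta = timedelta(**args) if args else timedelta(days=30)
        start = end - delta
    return start, end


start, end = resolve_time_range(query_end="2024-01-31T00:00:00", days=7)
print(start.isoformat(), end.isoformat())
```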
-
get_launch_summary
(query_start=None, query_end=None, time_field=u'time_end', query=None, runtime_stats=False, include_ids=False, **args)¶ Get launch summary for a specified time range.
Parameters: - query_start (str) – The start time (inclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is 30 days before current time.
- query_end (str) – The end time (exclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is current time.
- time_field (str) – The field to query time range. Default is “time_end”.
- query (dict) – Additional Pymongo queries to filter entries for process.
- runtime_stats (bool) – Whether to return runtime stats. Default is False.
- include_ids (bool) – Whether to return fw_ids. Default is False.
- args (dict) – Time difference to calculate query_start from query_end. Accepts arguments in python datetime.timedelta function. args and query_start can not be given at the same time. Default is 30 days.
Returns: (list) A summary of launch stats for the specified time range.
-
get_workflow_summary
(query_start=None, query_end=None, query=None, time_field=u'updated_on', **args)¶ Get workflow summary for a specified time range.
Parameters: - query_start (str) – The start time (inclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is 30 days before current time.
- query_end (str) – The end time (exclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is current time.
- query (dict) – Additional Pymongo queries to filter entries for process.
- time_field (str) – The field to query time range. Default is “updated_on”.
- args (dict) – Time difference to calculate query_start from query_end. Accepts arguments in python datetime.timedelta function. args and query_start can not be given at the same time. Default is 30 days.
Returns: (list) A summary of workflow stats for the specified time range.
-
group_fizzled_fireworks
(group_by, query_start=None, query_end=None, query=None, include_ids=False, **args)¶ Group fizzled fireworks for a specified time range by a specified key.
Parameters: - group_by (str) – Database field used to group fireworks items.
- query_start (str) – The start time (inclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is 30 days before current time.
- query_end (str) – The end time (exclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is current time.
- query (dict) – Additional Pymongo queries to filter entries for process.
- include_ids (bool) – Whether to return fw_ids. Default is False.
- args (dict) – Time difference to calculate query_start from query_end. Accepts arguments in python datetime.timedelta function. args and query_start can not be given at the same time. Default is 30 days.
Returns: (list) A summary of fizzled fireworks grouped by the specified key.
-
identify_catastrophes
(error_ratio=0.01, query_start=None, query_end=None, query=None, time_field=u'time_end', include_ids=True, **args)¶ Get days with a high failure ratio.
Parameters: - error_ratio (float) – Threshold of error ratio above which a day is considered catastrophic.
- query_start (str) – The start time (inclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is 30 days before current time.
- query_end (str) – The end time (exclusive) to query in isoformat (YYYY-MM-DDTHH:MM:SS.mmmmmm). Default is current time.
- query (dict) – Additional Pymongo queries to filter entries for process.
- time_field (str) – The field to query time range. Default is “time_end”.
- include_ids (bool) – Whether to return fw_ids. Default is True.
- args (dict) – Time difference to calculate query_start from query_end. Accepts arguments in python datetime.timedelta function. args and query_start can not be given at the same time. Default is 30 days.
Returns: (list) Dates with a high failure ratio, with optional failed fw_ids.
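The thresholding idea behind identify_catastrophes can be sketched on plain data: a day is flagged when its share of failed items exceeds error_ratio. The function name flag_catastrophic_days, the input layout (date string mapped to a (fizzled, total) pair), and the strict greater-than comparison are all assumptions for illustration, not the library's actual data model.

```python
def flag_catastrophic_days(daily_counts, error_ratio=0.01):
    """Hypothetical helper: return the days whose failure ratio exceeds error_ratio.

    daily_counts maps a date string to a (fizzled, total) tuple.
    """
    flagged = []
    for day, (fizzled, total) in sorted(daily_counts.items()):
        # Skip empty days to avoid dividing by zero.
        if total > 0 and fizzled / total > error_ratio:
            flagged.append(day)
    return flagged


daily_counts = {
    "2024-01-01": (0, 100),
    "2024-01-02": (5, 100),
    "2024-01-03": (0, 0),
}
bad_days = flag_catastrophic_days(daily_counts)
print(bad_days)
```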
-