Export steps
Steps required to export data to various destinations.
ExportStep
class ExportStep(model, query=None, search=None, object_ids=None)
Fetch the requested data from the database.
This step will query the database and pass on the raw returned query results.
In an export pipeline, the first step will usually be the ExportStep itself. This step gathers data from the database to be converted and formatted further at a later point in the pipeline.
You can either supply a query argument containing a MongoEngine query
or a search argument with options used to construct an ElasticSearch query.
Arguments:
str model: A model name or identifier, as used in the generic RESTful API endpoints.
list query: A list of actions to perform directly on the model class as queries. Example:
"query": [ ["objects", { "name__startswith": "A" }], ["order_by", ["last_login"]] ]
dict search: An ElasticSearch query object
Added context:
- str mode: Will be set to 'export'
- dict export
type model: Will be set to the model class
str model_name: Will be set to the model name (str)
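The query argument above reads as a chain of calls starting at the model class. The following is a minimal sketch of how such an action list could be applied, not the actual implementation of ExportStep. FakeQuerySet, FakeUser and apply_query are hypothetical names used only for illustration, and the rule that a dict becomes keyword arguments while a list becomes positional arguments is an assumption.

```python
class FakeQuerySet:
    """Hypothetical stand-in for a MongoEngine QuerySet; it only records calls."""

    def __init__(self, calls=()):
        self.calls = list(calls)

    def __call__(self, **filters):
        # Mirrors the shape of Model.objects(**filters)
        return FakeQuerySet(self.calls + [("filter", filters)])

    def order_by(self, *keys):
        return FakeQuerySet(self.calls + [("order_by", keys)])


class FakeUser:
    """Hypothetical model class exposing an `objects` manager."""
    objects = FakeQuerySet()


def apply_query(model, actions):
    """Apply each [name, args] action in turn, starting from the model class."""
    current = model
    for name, args in actions:
        target = getattr(current, name)
        if isinstance(args, dict):
            current = target(**args)  # assumption: dict -> keyword arguments
        else:
            current = target(*args)   # assumption: list -> positional arguments
    return current


query = [
    ["objects", {"name__startswith": "A"}],
    ["order_by", ["last_login"]],
]
result = apply_query(FakeUser, query)
print(result.calls)
```

With a real MongoEngine document class, the same chain would correspond to Model.objects(name__startswith="A").order_by("last_login").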
ConvertStep
class ConvertStep(attributes, model=None)
Prepare a list of raw attribute values for all documents.
The converter step is responsible for transforming the acquired data to a more appropriate format, only containing fields we actually requested.
Arguments:
- list attributes: A list of attributes to extract. An entry can be a dotted path to fetch
values from a related model field, e.g.
['foo', 'user.name']
Added context:
- dict convert
list attributes: Will be a copy of the supplied arguments list.
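As an illustration of dotted-path attributes, here is a minimal sketch that walks each path through nested data and keeps only the requested fields. It assumes documents behave like nested dicts or plain objects; resolve_path and convert are hypothetical helpers, not part of the documented API.

```python
def resolve_path(document, path):
    """Follow a dotted path through nested dicts or objects; None if a link is missing."""
    value = document
    for part in path.split("."):
        if value is None:
            return None
        if isinstance(value, dict):
            value = value.get(part)
        else:
            value = getattr(value, part, None)
    return value


def convert(documents, attributes):
    """Return one row of raw values per document, in attribute order."""
    return [[resolve_path(doc, attr) for attr in attributes] for doc in documents]


docs = [
    {"foo": 1, "user": {"name": "alice"}, "ignored": True},
    {"foo": 2, "user": None},
]
rows = convert(docs, ["foo", "user.name"])
print(rows)  # [[1, 'alice'], [2, None]]
```

Fields that were not requested (here "ignored") do not appear in the rows, and a missing relation yields None rather than an error.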
FormatStep
class FormatStep(target=None, depth=None, collapse=False, value_map=None, **options)
Format data in a flat or nested hierarchical structure.
- TODO: Infinite recursion detection and prevention:
To prevent exporting recursive documents, a filter must be present for any attribute that points to the same class of documents.
Example: When exporting page with the attribute parent, set a filter such as parent=None (which is only applied to the root documents being exported).
__init__(target=None, depth=None, collapse=False, value_map=None, **options)
Configure data conversion for this step.
- Parameters
target (str) – Currently supported are ‘text’ and ‘json’. Defaults to ‘text’, which converts everything to str before writing.
depth (int) – Depth zero yields a flat list of values; a positive depth exports selected attributes up to the specified depth. Set depth to a negative value (for example -1) for no restriction.
collapse (bool) – If True, empty nested trees will just collapse to a single NULL. WARNING: This will potentially remove keys from your output! Do not use if you rely on all columns defined to be present (albeit NULL).
value_map (dict) – Direct value mapping, e.g. None to ‘NULL’.
options (dict) – Additional options that will be passed to datatype converters.
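The effect of collapse, and the reason for the warning about disappearing keys, can be pictured with a small sketch. It assumes nested output is represented as plain dicts; collapse_empty and format_nested are hypothetical helpers and not the step's real implementation.

```python
def collapse_empty(tree):
    """Replace any nested dict whose leaves are all None with a single None."""
    if not isinstance(tree, dict):
        return tree
    collapsed = {key: collapse_empty(value) for key, value in tree.items()}
    if collapsed and all(value is None for value in collapsed.values()):
        return None
    return collapsed


def format_nested(document, collapse=False):
    """Collapse only below the root, so the root document itself is always kept."""
    if not collapse:
        return document
    return {key: collapse_empty(value) for key, value in document.items()}


row = {"name": "page-1", "parent": {"name": None, "owner": {"email": None}}}
print(format_nested(row))
print(format_nested(row, collapse=True))  # {'name': 'page-1', 'parent': None}
```

With collapse enabled, the keys parent.name and parent.owner.email no longer exist in the output, which matters if a downstream writer expects every column to be present.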
Formatting options can be supplied to this step, so data will look just like you requested.
Arguments:
target, depth, collapse, value_map, **options
- str target: (optional) A string identifying a known text format to export to, possible values are:
'text': This will coerce all data with str() before writing.
'json': The JSON format will leave some types alone, so they can be exported with their appropriate type still intact in a JSON file.
- int depth: (optional) Define how deep nested attributes will be parsed,
0 means no nesting, -1 will be unlimited.
- bool collapse: If the collapse flag is set and a relationship in an object
resolves to None, the complete dict will be set to None instead of its individual sub-properties (has no effect when depth=0).
- dict value_map: (optional) A dict or list of pairs containing replacement values for
specific values found in attributes. Use this, for example, to show ‘YES’ for True. Example:
"value_map": [ [null, "[NULL]"], [true, "[TRUE]"], [false, "[FALSE]"] ]
- **options: Further options can be supplied; they will be passed down to the individual datatype
formatters. Currently supported options include:
dateformat
datetimeformat
timeformat
Added context:
- dict format
int depth: Depth of data structures
bool flat: Will be true if depth is 0
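A value_map given as a list of pairs, as in the example above, can be applied with a simple lookup. The sketch below is one possible approach under stated assumptions, not the step's actual code: map_value is a hypothetical helper, and the type-strict comparison is a design choice made here because in Python True == 1 and False == 0, so a naive comparison would also rewrite integer values.

```python
def map_value(value, value_map):
    """Return the replacement for `value`, or `value` itself if no pair matches."""
    pairs = value_map.items() if isinstance(value_map, dict) else value_map
    for original, replacement in pairs:
        # Compare type as well as value so that True does not match 1.
        if type(value) is type(original) and value == original:
            return replacement
    return value


value_map = [[None, "[NULL]"], [True, "[TRUE]"], [False, "[FALSE]"]]
row = ["alice", None, True, False, 1, 0]
mapped = [map_value(v, value_map) for v in row]
print(mapped)  # ['alice', '[NULL]', '[TRUE]', '[FALSE]', 1, 0]
```

The list-of-pairs form is also the only way to express keys such as null or true in a JSON configuration, since JSON object keys must be strings.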
FileWriteStep
class FileWriteStep(load_class, options=None)
Writes the data acquired so far to a file.
__init__(load_class, options=None)
Load the configurable class and configure it.
This step will write the current state to a file using a writer of your choice.
Arguments:
- str load_class: The name of a writer class to use. Can also be
a fully qualified import path.
- dict options: Options to be passed down to the writer class
str filename: Filename to save to (relative path)
Added context:
- dict write:
Writer writer: The writer class that was used
str filename: The absolute path of the file that was written to
str mime_type: MIME type of the written file
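Resolving load_class from a fully qualified import path can be done with the standard library. This is a minimal sketch: the helper name load_class_by_path and the default_module parameter are assumptions for illustration, and where bare class names are actually looked up is not stated in this document.

```python
import importlib


def load_class_by_path(path, default_module=None):
    """Import and return the class named by `path`.

    `path` is either 'package.module.ClassName' or a bare 'ClassName', in
    which case `default_module` (hypothetical) names the module to search.
    """
    module_name, _, class_name = path.rpartition(".")
    if not module_name:
        if default_module is None:
            raise ValueError("bare class name given without a default module")
        module_name = default_module
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# The stdlib csv.DictWriter stands in for a writer class here.
writer_class = load_class_by_path("csv.DictWriter")
same_class = load_class_by_path("DictWriter", default_module="csv")
print(writer_class.__name__, writer_class is same_class)
```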
RenameStep
class RenameStep(template)
Rename an exported file according to some formatting options.
This step can usually be applied directly after the FileWriteStep and it will rename the written file according to a template.
__init__(template)
Set a template string to use as a new filename.
- Parameters
template (str) – The filename template.
The template can contain any strftime format identifier as well as these special placeholders:
{model}: Model name
{model_lower}: Model name (lowercase)
{model_upper}: Model name (uppercase)
{model_key}: Model key (lowercase)
Added context:
This step adds no context.
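A template that mixes strftime identifiers with the placeholders listed above could be rendered roughly as follows. This is a sketch under assumptions: render_filename is a hypothetical helper, the order of expansion (strftime first, then placeholders) is a guess, and the model key is passed in explicitly because its origin is not described here.

```python
from datetime import datetime


def render_filename(template, model_name, model_key, now=None):
    """Expand strftime codes, then the {model...} placeholders."""
    now = now or datetime.now()
    # strftime leaves the curly-brace placeholders untouched.
    dated = now.strftime(template)
    return dated.format(
        model=model_name,
        model_lower=model_name.lower(),
        model_upper=model_name.upper(),
        model_key=model_key.lower(),
    )


name = render_filename(
    "{model_lower}_%Y-%m-%d.csv", "User", "user", now=datetime(2020, 1, 2)
)
print(name)  # user_2020-01-02.csv
```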
ZipStep
class ZipStep
Zip the exported file.
TODO: The original file will not be deleted in the export directory.
This step will zip the file found at the current state.
Arguments:
This step takes no arguments.
Added context:
- dict zip:
str filename: The new filename for the zipped file.
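For orientation, zipping a single exported file with the standard library looks roughly like this. It is a sketch, not ZipStep's code: zip_file is a hypothetical helper, and appending ".zip" to the original name is an assumption. As with the TODO above, the original file is left in place.

```python
import os
import tempfile
import zipfile


def zip_file(filename):
    """Write `filename` into a sibling .zip archive and return the archive path."""
    zip_name = filename + ".zip"
    with zipfile.ZipFile(zip_name, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Store only the base name so the archive has no directory prefix.
        archive.write(filename, arcname=os.path.basename(filename))
    return zip_name


# Create a small throwaway export file to zip.
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "export.csv")
with open(source, "w") as handle:
    handle.write("name\nalice\n")

zipped = zip_file(source)
with zipfile.ZipFile(zipped) as archive:
    names = archive.namelist()
print(names, os.path.exists(source))
```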
XSLTExportStep
class XSLTExportStep(filepath=None, file=None)
XSLT step that takes an XSLT template as a string for XML transformation.
This step receives an XSLT template as a string and uses it to transform the XML.
Arguments:
str xslt: XSLT transformation template.
Added context:
This step adds no context.