polymatheia.data.writer

This module provides a writer for serialising data to the local filesystem.

class polymatheia.data.writer.CSVWriter(target, default_value='', extras_action='ignore', column_names=None)

The CSVWriter writes records into a CSV file.

The CSVWriter assumes that no record contains any kind of nested data. If it is passed nested data, then the behaviour is undefined.

__init__(target, default_value='', extras_action='ignore', column_names=None)

Create a new CSVWriter.

Parameters:
  • target – The target to write the CSV to. Can be either a str filename or an existing file-like object.

  • default_value – The default value to output if a record does not contain a value for a specified CSV column name

  • extras_action (str) – The action to take if a record contains keys that are not in the CSV column names. Set to 'ignore' to ignore the extra keys (the default). Set to 'raise' to raise a ValueError.

  • column_names (list of str) – The CSV column names to use. If None is specified, then the column names are derived from the first record’s keys.

__weakref__

list of weak references to the object (if defined)

write(records)

Write the records to the CSV file.

Parameters:

records (Iterable of NavigableDict) – The records to write
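
The interplay of default_value, extras_action and column_names can be illustrated with the standard library's csv.DictWriter, whose restval, extrasaction and fieldnames arguments play comparable roles. The following is a minimal sketch that assumes CSVWriter behaves analogously; it does not use polymatheia itself, and write_csv and the sample records are hypothetical names invented for illustration.

```python
import csv
import io

# Hypothetical flat records: the second lacks 'year' and has an extra key.
records = [
    {'id': '1', 'title': 'Faust', 'year': '1808'},
    {'id': '2', 'title': 'Woyzeck', 'publisher': 'unknown'},
]


def write_csv(records, column_names=None, default_value='', extras_action='ignore'):
    """Write flat records to a CSV string (a stand-in for a file target)."""
    records = list(records)
    if column_names is None:
        # Assumed to mirror the documented fallback: use the first record's keys.
        column_names = list(records[0].keys())
    target = io.StringIO(newline='')
    writer = csv.DictWriter(target, fieldnames=column_names,
                            restval=default_value, extrasaction=extras_action)
    writer.writeheader()
    writer.writerows(records)
    return target.getvalue()


# Missing values are filled with the default, extra keys are dropped.
output = write_csv(records, default_value='n/a')

# With 'raise', the extra 'publisher' key triggers a ValueError.
try:
    write_csv(records, extras_action='raise')
    raised = False
except ValueError:
    raised = True
```

In this sketch the missing 'year' of the second record is written as 'n/a', and the 'publisher' key is silently dropped unless extras_action is 'raise'.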

class polymatheia.data.writer.JSONWriter(directory, id_path)

The JSONWriter writes records to the local filesystem as JSON files.

__init__(directory, id_path)

Create a new JSONWriter.

For each record the identifier is used to create a directory structure. In the leaf directory the identifier is then used as the filename.

Parameters:
  • directory (str) – The base directory within which to create the files

  • id_path (str or list) – The path used to access the identifier in the record
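
Because id_path may be a str or a list, it can help to see how both forms might address the same nested value. The sketch below assumes that a str path uses dots to separate keys; resolve is a hypothetical helper and the record is invented for illustration, so neither is part of the polymatheia API.

```python
def resolve(record, id_path):
    """Look up a nested value by dotted string or by list of keys."""
    # Assumption: a str path separates the keys with dots.
    keys = id_path.split('.') if isinstance(id_path, str) else list(id_path)
    value = record
    for key in keys:
        value = value[key]
    return value


# Hypothetical record with the identifier nested under 'header'.
record = {'header': {'identifier': 'oai:example:1'}}
from_str = resolve(record, 'header.identifier')
from_list = resolve(record, ['header', 'identifier'])
```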

__weakref__

list of weak references to the object (if defined)

write(records)

Write the records to the filesystem.

Parameters:

records (Iterable of NavigableDict) – The records to write
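
The idea of deriving a directory structure from each record's identifier can be sketched with the standard library alone. The layout below is an assumption, not the scheme the JSONWriter is documented to use: it substitutes a SHA-256 digest of the identifier for the identifier itself, so that characters such as ':' or '/' cannot end up in a path. record_path and write_json are hypothetical helpers.

```python
import hashlib
import json
import os
import tempfile


def record_path(directory, identifier, suffix='.json', chunk=2, depth=3):
    """Build a nested file path for an identifier (hypothetical layout)."""
    digest = hashlib.sha256(identifier.encode('utf-8')).hexdigest()
    # Results in <directory>/<2 chars>/<2 chars>/<2 chars>/<digest>.json
    parts = [digest[i * chunk:(i + 1) * chunk] for i in range(depth)]
    return os.path.join(directory, *parts, digest + suffix)


def write_json(directory, id_path, records):
    """Write each record to its own JSON file and return the paths."""
    written = []
    for record in records:
        identifier = record
        for key in id_path:  # id_path is given as a list of keys here
            identifier = identifier[key]
        path = record_path(directory, identifier)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, 'w', encoding='utf-8') as out_f:
            json.dump(record, out_f)
        written.append(path)
    return written


# Hypothetical record, written into a temporary base directory.
records = [{'header': {'identifier': 'oai:example:1'}, 'title': 'Faust'}]
with tempfile.TemporaryDirectory() as base:
    paths = write_json(base, ['header', 'identifier'], records)
    with open(paths[0], encoding='utf-8') as in_f:
        restored = json.load(in_f)
    depth_below_base = len(os.path.relpath(paths[0], base).split(os.sep))
```

Spreading files over nested directories keeps any single directory from accumulating a very large number of entries, which is one plausible motivation for such a layout.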

class polymatheia.data.writer.PandasDFWriter

The PandasDFWriter writes records to a Pandas DataFrame.

The PandasDFWriter attempts to automatically coerce columns to integers or floats.

The PandasDFWriter assumes that no record contains any kind of nested data. If it is passed nested data, then the behaviour is undefined.

__init__()

Create a new PandasDFWriter.

__weakref__

list of weak references to the object (if defined)

write(records)

Write the records to the Pandas DataFrame.

Parameters:

records (Iterable of NavigableDict) – The records to write

Returns:

The Pandas DataFrame

Return type:

DataFrame
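
What coercing columns to integers or floats could mean in practice is sketched below in plain Python, without pandas. The sketch assumes that all values arrive as strings and that a column is only converted when every value in it converts; coerce_column is a hypothetical helper and does not describe how the PandasDFWriter is implemented.

```python
def coerce_column(values):
    """Return the values as ints, else as floats, else unchanged."""
    for cast in (int, float):
        try:
            return [cast(value) for value in values]
        except (TypeError, ValueError):
            # At least one value does not fit this type, try the next one.
            continue
    return list(values)


# Hypothetical flat records with string values only.
records = [
    {'id': '1', 'score': '0.5', 'title': 'Faust'},
    {'id': '2', 'score': '3', 'title': 'Woyzeck'},
]

# Coerce each column independently, taking column names from the first record.
columns = {
    name: coerce_column([record[name] for record in records])
    for name in records[0]
}
```

Here 'id' becomes a column of integers, 'score' falls back to floats because '0.5' is not an integer, and 'title' stays as strings. In pandas itself, pandas.to_numeric performs a comparable conversion on a Series.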

class polymatheia.data.writer.XMLWriter(directory, id_path)

The XMLWriter writes records to the local filesystem as XML files.

__init__(directory, id_path)

Create a new XMLWriter.

For each record the identifier is used to create a directory structure. In the leaf directory the identifier is then used as the filename.

Parameters:
  • directory (str) – The base directory within which to create the files

  • id_path (str or list) – The path used to access the identifier in the record

__weakref__

list of weak references to the object (if defined)

write(records)

Write the records to the filesystem.

Parameters:

records (Iterable of NavigableDict) – The records to write