polymatheia.data.reader

This module provides a range of readers for accessing local and remote resources.

All readers return their data as NavigableDict objects.

class polymatheia.data.reader.CSVReader(source)

The CSVReader provides access to a CSV file.

__init__(source)

Create a new CSVReader.

Parameters:

source – The source to load the CSV from. Can be either a str filename or a file-like object.
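The following is a minimal sketch of how a reader can accept either a filename or a file-like object, as the source parameter describes. It uses only the standard csv module. The function name iter_csv_rows is hypothetical, and the sketch is not the CSVReader implementation: it yields plain dicts rather than NavigableDict objects.

```python
import csv
import io


def iter_csv_rows(source):
    """Yield each CSV row as a dict, from a filename or a file-like object.

    Hypothetical helper for illustration; not part of polymatheia.
    """
    if isinstance(source, str):
        # A str is treated as a path. newline='' is what the csv module expects.
        with open(source, newline='', encoding='utf-8') as handle:
            yield from csv.DictReader(handle)
    else:
        # Anything else is assumed to be an already-open text stream.
        yield from csv.DictReader(source)


# Usage with an in-memory file-like object.
rows = list(iter_csv_rows(io.StringIO('id,title\n1,Faust\n2,Egmont\n')))
print(rows[0]['title'])
```

Passing an open stream leaves responsibility for closing it with the caller, while a filename is opened and closed inside the helper.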

__iter__()

Return this CSVReader as the iterator.

__weakref__

list of weak references to the object (if defined)

class polymatheia.data.reader.EuropeanaSearchIterator(api_key, query, max_records=None, query_facets=None, media=None, thumbnail=None, reusability=None, profile=None)

The EuropeanaSearchIterator provides an iterator for the Europeana Search API.

The initial search is run immediately on creating a new EuropeanaSearchIterator. The iterator will automatically paginate through the full set of result pages.
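The behaviour described above (an initial request on construction, then transparent pagination) can be sketched as follows. This is an illustration under stated assumptions, not the library's code: PagedSearchIterator, fetch_page, and the cursor values are all hypothetical, and a stub stands in for the remote API.

```python
class PagedSearchIterator:
    """Iterate over records, fetching the next page only when needed.

    Hypothetical stand-in; fetch_page(cursor) must return (records, next_cursor),
    with next_cursor set to None on the last page.
    """

    def __init__(self, fetch_page, max_records=None):
        self._fetch_page = fetch_page
        self._max_records = max_records
        self._returned = 0
        # The initial search runs immediately on construction.
        records, self._cursor = fetch_page(None)
        self._buffer = list(records)

    def __iter__(self):
        return self

    def __next__(self):
        if self._max_records is not None and self._returned >= self._max_records:
            raise StopIteration
        # Refill the buffer from the next page until a record is available.
        while not self._buffer:
            if self._cursor is None:
                raise StopIteration
            records, self._cursor = self._fetch_page(self._cursor)
            self._buffer = list(records)
        self._returned += 1
        return self._buffer.pop(0)


# Stub backend with two pages, used in place of a real search API.
PAGES = {None: (['a', 'b'], 'p2'), 'p2': (['c'], None)}


def fake_fetch(cursor):
    return PAGES[cursor]


all_records = list(PagedSearchIterator(fake_fetch))
first_two = list(PagedSearchIterator(fake_fetch, max_records=2))
print(all_records, first_two)
```

Because the object is its own iterator, it can be consumed only once; a second pass needs a new instance.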

result_count: int

The total number of records returned by the search.

facets: polymatheia.data.NavigableDict

The facets generated by the search. This is only set if the profile parameter is set to 'facets'.

__init__(api_key, query, max_records=None, query_facets=None, media=None, thumbnail=None, reusability=None, profile=None)

Create a new EuropeanaSearchIterator.

Parameters:
  • api_key (str) – The Europeana API key

  • query (str) – The query string

  • max_records (int) – The maximum number of records to return. Defaults to all records

  • query_facets (list of str) – The list of query facets to apply to the search

  • media (bool) – Whether to require that matching records have media attached. Defaults to no requirement

  • thumbnail (bool) – Whether to require that matching records have a thumbnail. Defaults to no requirement

  • reusability (str) – The reusability (rights) to require. Defaults to no limits

  • profile (str) – The result profile to request. Defaults to 'standard'

__iter__()

Return this EuropeanaSearchIterator as the iterator.

__next__()

Return the next record as a NavigableDict.

Raises:

StopIteration – If no more Records are available

__weakref__

list of weak references to the object (if defined)

class polymatheia.data.reader.EuropeanaSearchReader(api_key, query, max_records=None, query_facets=None, media=None, thumbnail=None, reusability=None, profile=None)

The EuropeanaSearchReader provides access to the Europeana Search API.

The initial search is run immediately on creating a new EuropeanaSearchReader. The iterator will automatically paginate through the full set of result pages.

result_count: int

The total number of records returned by the search.

facets: polymatheia.data.NavigableDict

The facets generated by the search. This is only set if the profile parameter is set to 'facets'.

__init__(api_key, query, max_records=None, query_facets=None, media=None, thumbnail=None, reusability=None, profile=None)

Create a new EuropeanaSearchReader.

Parameters:
  • api_key (str) – The Europeana API key

  • query (str) – The query string

  • max_records (int) – The maximum number of records to return. Defaults to all records

  • query_facets (list of str) – The list of query facets to apply to the search

  • media (bool) – Whether to require that matching records have media attached. Defaults to no requirement

  • thumbnail (bool) – Whether to require that matching records have a thumbnail. Defaults to no requirement

  • reusability (str) – The reusability (rights) to require. Defaults to no limits

  • profile (str) – The result profile to request. Defaults to 'standard'

__iter__()

Return this EuropeanaSearchReader as the iterator.

__weakref__

list of weak references to the object (if defined)

class polymatheia.data.reader.JSONReader(directory)

The JSONReader is a container for reading JSON files from the filesystem.

It is designed to provide access to data serialised using the JSONWriter.

Important

It does not guarantee that the order of records is the same as the order in which they were written to the local filesystem.
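Code that depends on a stable order therefore has to impose one itself. The sketch below shows one way to do this with the standard library: collect the file paths first, then read them in sorted order. It assumes one record per ".json" file below the directory, which is an assumption made for illustration rather than a statement about the JSONWriter layout, and iter_json_files is a hypothetical name.

```python
import json
import os
import tempfile


def iter_json_files(directory):
    """Yield (path, parsed JSON) for every .json file below directory.

    Paths are sorted so the order is reproducible between runs.
    Hypothetical helper for illustration; not part of polymatheia.
    """
    paths = []
    for base, _dirs, files in os.walk(directory):
        paths.extend(os.path.join(base, name) for name in files if name.endswith('.json'))
    for path in sorted(paths):
        with open(path, encoding='utf-8') as handle:
            yield path, json.load(handle)


# Write two files in "wrong" order, then read them back sorted by path.
with tempfile.TemporaryDirectory() as tmp:
    for name in ('b', 'a'):
        with open(os.path.join(tmp, name + '.json'), 'w', encoding='utf-8') as handle:
            json.dump({'id': name}, handle)
    ids = [record['id'] for _path, record in iter_json_files(tmp)]

print(ids)
```

Sorting by path gives a reproducible order, not the original write order; recovering the latter would need a sequence number or timestamp stored in each record.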

__init__(directory)

Create a new JSONReader.

Parameters:

directory (str) – The base directory within which to load the files

__iter__()

Return a new NavigableDictIterator as the iterator.

__weakref__

list of weak references to the object (if defined)

class polymatheia.data.reader.OAIMetadataFormatReader(url)

The OAIMetadataFormatReader is a container for OAI-PMH MetadataFormat.

The underlying library automatically handles the continuation parameters, allowing for simple iteration.

__init__(url)

Construct a new OAIMetadataFormatReader.

Parameters:

url (str) – The base URL of the OAI-PMH server

__iter__()

Return a new NavigableDictIterator as the iterator.

__weakref__

list of weak references to the object (if defined)

class polymatheia.data.reader.OAIRecordReader(url, metadata_prefix='oai_dc', max_records=None, set_spec=None)

The OAIRecordReader is an iteration container for OAI-PMH Records.

The underlying library automatically handles the continuation parameters, allowing for simple iteration.

__init__(url, metadata_prefix='oai_dc', max_records=None, set_spec=None)

Construct a new OAIRecordReader.

Parameters:
  • url (str) – The base URL of the OAI-PMH server

  • metadata_prefix (str) – The metadata prefix to use for accessing data. Defaults to 'oai_dc'

  • max_records (int) – The maximum number of records to return. Default (None) returns all records

  • set_spec (str) – The OAI Set specification for limiting which metadata to fetch

__iter__()

Return a new NavigableDictIterator as the iterator.

If max_records is set, then the NavigableDictIterator is wrapped in a LimitingIterator.
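The wrapping step can be illustrated with itertools.islice from the standard library. This is a minimal sketch of the idea only: LimitedRecords is a hypothetical class, and the actual LimitingIterator in polymatheia.data may be implemented differently.

```python
from itertools import islice


class LimitedRecords:
    """Iterable wrapper that yields at most max_records items per iteration.

    Hypothetical stand-in for illustration; not part of polymatheia.
    """

    def __init__(self, records, max_records=None):
        self._records = records
        self._max_records = max_records

    def __iter__(self):
        # A fresh iterator is created on every call, so the container is reusable.
        iterator = iter(self._records)
        if self._max_records is not None:
            # Only wrap when a limit was requested; None means "all records".
            iterator = islice(iterator, self._max_records)
        return iterator


limited = list(LimitedRecords(range(10), max_records=3))
unlimited = list(LimitedRecords(range(4)))
print(limited, unlimited)
```

Returning a new iterator from __iter__ each time is what separates a reusable container from a one-shot iterator.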

__weakref__

list of weak references to the object (if defined)

class polymatheia.data.reader.OAISetReader(url)

The OAISetReader is an iteration container for OAI-PMH Sets.

The underlying library automatically handles the continuation parameters, allowing for simple iteration.

__init__(url)

Construct a new OAISetReader.

Parameters:

url (str) – The base URL of the OAI-PMH server

__iter__()

Return a new NavigableDictIterator as the iterator.

__weakref__

list of weak references to the object (if defined)

class polymatheia.data.reader.SRUExplainRecordReader(url)

The SRUExplainRecordReader is a container for SRU Explain Records.

__init__(url)

Construct a new SRUExplainRecordReader.

Parameters:

url (str) – The base URL of the SRU server

__iter__()

Return a new NavigableDictIterator as the iterator.

__weakref__

list of weak references to the object (if defined)

class polymatheia.data.reader.SRURecordReader(url, query, max_records=None, record_schema='dc', **kwargs)

The SRURecordReader is an iteration container for Records fetched via SRU.

The underlying library (SRUpy) automatically handles the continuation parameters, allowing for simple iteration.

__init__(url, query, max_records=None, record_schema='dc', **kwargs)

Construct a new SRURecordReader.

Parameters:
  • url (str) – The base URL of the SRU endpoint

  • query (str) – The query string

  • max_records (int) – The maximum number of records to return

  • record_schema (str) – The schema in which records are returned. Defaults to 'dc' (Dublin Core)

  • kwargs – Additional request parameters that will be sent to the SRU server

__iter__()

Return a new NavigableDictIterator as the iterator.

__weakref__

list of weak references to the object (if defined)

static result_count(url, query)

Return the result count for the given query.

Parameters:
  • url (str) – The base URL of the SRU endpoint

  • query (str) – The query string

class polymatheia.data.reader.XMLReader(directory)

The XMLReader is a container for reading XML files from the local filesystem.

The XMLReader will only load files that have a “.xml” extension.
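The extension filter can be sketched with pathlib and xml.etree from the standard library. The helper below is hypothetical and makes two assumptions the text does not confirm: that subdirectories are searched recursively, and that the match on ".xml" is case-sensitive. It returns ElementTree elements, not NavigableDict objects.

```python
import tempfile
from pathlib import Path
from xml.etree import ElementTree


def iter_xml_roots(directory):
    """Yield the root element of every *.xml file below directory.

    Hypothetical helper for illustration; not part of polymatheia.
    """
    for path in sorted(Path(directory).rglob('*.xml')):
        if path.is_file():
            yield ElementTree.parse(str(path)).getroot()


# One XML file and one unrelated file; only the former is loaded.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, 'record.xml').write_text('<record><title>Faust</title></record>', encoding='utf-8')
    Path(tmp, 'notes.txt').write_text('not XML', encoding='utf-8')
    titles = [root.findtext('title') for root in iter_xml_roots(tmp)]

print(titles)
```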

__init__(directory)

Create a new XMLReader.

Parameters:

directory (str) – The base directory within which to load the files

__iter__()

Return a new NavigableDictIterator as the iterator.

__weakref__

list of weak references to the object (if defined)