bw2data.backends#

Subpackages#

Submodules#

Package Contents#

Classes#

JSONDatabase

A data store for LCI databases. Stores each dataset in a separate file, serialized to JSON.

LCIBackend

A base class for LCI backends.

SQLiteBackend

A base class for LCI backends.

SingleFileDatabase

A data store for LCI databases where each database is stored as a pickle file.

Functions#

convert_backend(database_name, backend)

Convert a Database to another backend.

class bw2data.backends.JSONDatabase(name)#

Bases: bw2data.backends.base.LCIBackend

A data store for LCI databases. Stores each dataset in a separate file, serialized to JSON.

Instead of loading all the data at once, .load() creates a SynchronousJSONDict, which loads values on demand.

Use this backend by setting "backend":"json" in the database metadata. This is done automatically if you call .register() from this class.
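
For example, a minimal sketch of creating and registering a JSON-backed database (the database name is illustrative):

from bw2data.backends import JSONDatabase

db = JSONDatabase("example json db")  # illustrative name
db.register()                         # writes "backend": "json" into the metadata
lazy = db.load()                      # a SynchronousJSONDict; values load on demand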

backend = 'json'#
filepath_intermediate()#
get(code)#

Get an Activity proxy for this dataset.

load(as_dict=False, *args, **kwargs)#

Instantiate SynchronousJSONDict for this database.

register(**kwargs)#

Register a database with the metadata store, using the correct value for backend, and create the database directory.

write(data, process=True)#

Serialize data to disk. Most of the time, this data has already been saved to disk, so this is a no-op. The only exception is if data is a new database dictionary.

Normalizes units when found.

Parameters

data (*) – Inventory data

class bw2data.backends.LCIBackend(name)[source]#

Bases: bw2data.data_store.ProcessedDataStore

Inheritance diagram of bw2data.backends.LCIBackend

A base class for LCI backends.

Subclasses must support at least the following calls:

  • load()

  • write(data)

In addition, they should specify their backend with the backend attribute (a unicode string).

LCIBackend provides the following, which should not need to be modified:

  • rename

  • copy

  • find_dependents

  • random

  • process

For new classes to be recognized by the DatabaseChooser, they need to be registered with the config object, e.g.:

config.backends['backend type string'] = BackendClass
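
As an illustration only, a minimal custom backend satisfying the contract above might look like the following sketch (the class name, backend string, and in-memory storage are invented for the example):

from bw2data import config
from bw2data.backends.base import LCIBackend

class InMemoryBackend(LCIBackend):      # hypothetical backend class
    backend = "inmemory"                # hypothetical backend type string
    _data = {}

    def load(self, as_dict=False, *args, **kwargs):
        # Must return all data as a plain dict when as_dict=True
        return dict(self._data)

    def write(self, data):
        # data maps ('database name', 'dataset code') keys to dataset dicts
        self._data = dict(data)

    def filepath_intermediate(self):
        # Nothing is stored on disk in this in-memory sketch
        raise NotImplementedError

# Register so the DatabaseChooser recognizes the new backend
config.backends["inmemory"] = InMemoryBackend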

Instantiation does not load any data. If this database is not yet registered in the metadata store, a warning is written to stdout.

The data schema for databases in voluptuous is:

exchange = {
        Required("input"): valid_tuple,
        Required("type"): basestring,
        }
exchange.update(uncertainty_dict)
lci_dataset = {
    Optional("categories"): Any(list, tuple),
    Optional("location"): object,
    Optional("unit"): basestring,
    Optional("name"): basestring,
    Optional("type"): basestring,
    Optional("exchanges"): [exchange]
}
db_validator = Schema({valid_tuple: lci_dataset}, extra=True)

where:
  • valid_tuple is a dataset identifier, like ("ecoinvent", "super strong steel")

  • uncertainty_fields are fields from an uncertainty dictionary.
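
For illustration, a hypothetical dataset that would pass db_validator could look like this (all names and codes are invented):

example_data = {
    ("example db", "steel-code"): {
        "name": "steel production",
        "unit": "kilogram",
        "location": "GLO",
        "type": "process",
        "exchanges": [
            {
                "input": ("biosphere3", "co2-flow-code"),
                "type": "biosphere",
                # uncertainty fields, e.g. amount and uncertainty type
                "amount": 1.8,
                "uncertainty type": 0,
            },
        ],
    },
}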

Processing a Database actually produces two parameter arrays: one for the exchanges, which make up the technosphere and biosphere matrices, and a geomapping array which links activities to locations.

Parameters

name (unicode string) – Name of the database to manage.

property filename#

Remove filesystem-unsafe characters and perform unicode normalization on self.name using utils.safe_filename().

_metadata#
dtype_fields = [(), (), (), (), ()]#
dtype_fields_geomapping = [(), (), (), ()]#
validator#
copy(name)#

Make a copy of the database.

Internal links within the database will be updated to match the new database name, i.e. ("old name", "some id") will be converted to ("new name", "some id") for all exchanges.

Parameters

name (*) – Name of the new database. Must not already exist.
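
A short usage sketch (database names are illustrative):

from bw2data import Database

old = Database("example db")
new = old.copy("example db copy")  # fails if "example db copy" already exists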

delete(**kwargs)#

Delete data from this instance. For the base class, only clears cached data.

filepath_geomapping()#
abstract filepath_intermediate()#
find_dependents(data=None, ignore=None)#

Get sorted list of direct dependent databases (databases linked from exchanges).

Parameters
  • data (*) – Inventory data

  • ignore (*) – List of database names to ignore

Returns

List of database names

find_graph_dependents()#

Recursively get list of all dependent databases.

Returns

A set of database names
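
A usage sketch covering both dependency methods (the database name is illustrative):

from bw2data import Database

db = Database("example db")
print(db.find_dependents())        # sorted list of names, e.g. ['biosphere3']
print(db.find_graph_dependents())  # set of names reachable through exchanges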

abstract load(*args, **kwargs)#

Load the intermediate data for this database.

If load() does not return a dictionary, then the returned object must have at least the following dictionary-like methods:

  • __iter__

  • __contains__

  • __getitem__

  • __setitem__

  • __delitem__

  • __len__

  • keys()

  • values()

  • items()

However, this method must support the keyword argument as_dict, and .load(as_dict=True) must return a normal dictionary with all Database data. This is necessary for JSON serialization.

It is recommended to subclass collections.abc.MutableMapping (see SynchronousJSONDict for an example of data loaded on demand).

process(*args, **kwargs)#

Process inventory documents.

Creates both a parameter array for exchanges, and a geomapping parameter array linking inventory activities to locations.

If the uncertainty type is no uncertainty, undefined, or not specified, then the ‘amount’ value is used for ‘loc’ as well. This is needed for the random number generator.

Parameters

version (*) – The version of the database to process

Doesn’t return anything, but writes two files to disk.

query(*queries)#

Search through the database.

random()#

Return a random activity key.

Returns a random activity key, or None (and issues a warning) if the current database is empty.

register(**kwargs)#

Register a database with the metadata store.

Databases must be registered before data can be written.

Writing data automatically sets the following metadata:
  • depends: Names of the databases that this database references, e.g. “biosphere”

  • number: Number of processes in this database.

Parameters

format (*) – Format that the database was converted from, e.g. “Ecospold”

relabel_data(data, new_name)#

Relabel database keys and exchanges.

Where dataset keys and exchanges refer to this same database internally, update them to use the new database name new_name.

Needed to copy a database completely or cut out a section of a database.

For example:

data = {
    ("old and boring", 1): {
        "exchanges": [
            {"input": ("old and boring", 42), "amount": 1.0},
        ]
    },
    ("old and boring", 2): {
        "exchanges": [
            {"input": ("old and boring", 1), "amount": 4.0},
        ]
    },
}
print(relabel_data(data, "shiny new"))
>> {
    ("shiny new", 1): {
        "exchanges": [
            {"input": ("old and boring", 42), "amount": 1.0},
        ]
    },
    ("shiny new", 2): {
        "exchanges": [
            {"input": ("shiny new", 1), "amount": 4.0},
        ]
    },
}

In the example, the exchange to ("old and boring", 42) does not change, as this is not part of the updated data.

Parameters
  • data (*) – The data to modify

  • new_name (*) – The name of the modified database

Returns

The modified data

rename(name)#

Rename a database. Modifies exchanges to link to new name. Deregisters old database.

Parameters

name (*) – New name.

Returns

New Database object.

abstract write(data)#

Serialize data to disk.

data must be a dictionary of the form:

{
    ('database name', 'dataset code'): {dataset}
}

class bw2data.backends.SQLiteBackend(*args, **kwargs)#

Bases: bw2data.backends.base.LCIBackend

A base class for LCI backends.

Subclasses must support at least the following calls:

  • load()

  • write(data)

In addition, they should specify their backend with the backend attribute (a unicode string).

LCIBackend provides the following, which should not need to be modified:

  • rename

  • copy

  • find_dependents

  • random

  • process

For new classes to be recognized by the DatabaseChooser, they need to be registered with the config object, e.g.:

config.backends['backend type string'] = BackendClass

Instantiation does not load any data. If this database is not yet registered in the metadata store, a warning is written to stdout.

The data schema for databases in voluptuous is:

exchange = {
        Required("input"): valid_tuple,
        Required("type"): basestring,
        }
exchange.update(uncertainty_dict)
lci_dataset = {
    Optional("categories"): Any(list, tuple),
    Optional("location"): object,
    Optional("unit"): basestring,
    Optional("name"): basestring,
    Optional("type"): basestring,
    Optional("exchanges"): [exchange]
}
db_validator = Schema({valid_tuple: lci_dataset}, extra=True)

where:
  • valid_tuple is a dataset identifier, like ("ecoinvent", "super strong steel")

  • uncertainty_fields are fields from an uncertainty dictionary.

Processing a Database actually produces two parameter arrays: one for the exchanges, which make up the technosphere and biosphere matrices, and a geomapping array which links activities to locations.

Parameters

name (unicode string) – Name of the database to manage.

property _searchable#
backend = 'sqlite'#
filters#
order_by#
_add_indices()#
_drop_indices()#
_efficient_write_dataset(index, key, ds, exchanges, activities)#
_efficient_write_many_data(data, indices=True)#
_get_filters()#
_get_order_by()#
_get_queryset(random=False, filters=True)#
_set_filters(filters)#
_set_order_by(field)#
delete(keep_params=False, warn=True)#

Delete all data from the SQLite database and the Whoosh search index.

get(code)#
graph_technosphere(filename=None, **kwargs)#
load(*args, **kwargs)#

Load the intermediate data for this database.

If load() does not return a dictionary, then the returned object must have at least the following dictionary-like methods:

  • __iter__

  • __contains__

  • __getitem__

  • __setitem__

  • __delitem__

  • __len__

  • keys()

  • values()

  • items()

However, this method must support the keyword argument as_dict, and .load(as_dict=True) must return a normal dictionary with all Database data. This is necessary for JSON serialization.

It is recommended to subclass collections.abc.MutableMapping (see SynchronousJSONDict for an example of data loaded on demand).

make_searchable(reset=False)#
make_unsearchable()#
new_activity(code, **kwargs)#
process()#

Process inventory documents to NumPy structured arrays.

Use a raw SQLite3 cursor instead of Peewee for a ~2 times speed advantage.

random(filters=True, true_random=False)#

True random requires loading and sorting data in SQLite, and can be resource-intensive.

search(string, **kwargs)#

Search this database for string.

The searcher includes the following fields:

  • name

  • comment

  • categories

  • location

  • reference product

string can include wild cards, e.g. "trans*".

By default, the name field is given the most weight. The full weighting set is called the boost dictionary, and the default weights are:

{
    "name": 5,
    "comment": 1,
    "product": 3,
    "categories": 2,
    "location": 3
}

Optional keyword arguments:

  • limit: Number of results to return.

  • boosts: Dictionary of field names and numeric boosts - see default boost values above. New values must be in the same format, but with different weights.

  • filter: Dictionary of criteria that search results must meet, e.g. {'categories': 'air'}. Keys must be one of the above fields.

  • mask: Dictionary of criteria that exclude search results. Same format as filter.

  • facet: Field to facet results. Must be one of name, product, categories, location, or database.

  • proxy: Return Activity proxies instead of raw Whoosh documents. Default is True.

Returns a list of Activity datasets.
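
A search sketch with custom boosts and a filter (the database name and query are illustrative):

from bw2data import Database

db = Database("example db")
results = db.search(
    "steel*",
    limit=10,
    boosts={"name": 10, "comment": 1, "product": 3, "categories": 2, "location": 3},
    filter={"location": "GLO"},
)
for activity in results:  # Activity proxies, since proxy defaults to True
    print(activity["name"])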

write(data, process=True)#

Write data to database.

data must be a dictionary of the form:

{
    ('database name', 'dataset code'): {dataset}
}

Writing a database will first delete all existing data.
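
A minimal write sketch (the database name, code, and values are illustrative):

from bw2data import Database

db = Database("example db")  # default backend is SQLite
db.register()
db.write({
    ("example db", "steel-code"): {
        "name": "steel production",
        "unit": "kilogram",
        "exchanges": [],
    },
})  # process=True by default, so parameter arrays are also created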

class bw2data.backends.SingleFileDatabase(name)#

Bases: bw2data.backends.base.LCIBackend

A data store for LCI databases where each database is stored as a pickle file.

Databases are automatically versioned. See below for methods to list, load, and revert to previous versions.

Parameters

name (str) – Name of the database to manage.

property filename#

Remove filesystem-unsafe characters and perform unicode normalization on self.name using utils.safe_filename().

property version#

The current version number (integer) of this database.

Returns

Version number

backend = 'singlefile'#
validator#
filename_for_version(version=None)#

Filename for the given version; defaults to the current version.

Returns

Filename (not path)

filepath_intermediate(version=None)#
get(code)#

Get an Activity proxy for this dataset.

load(version=None, **kwargs)#

Load the intermediate data for this database.

Can also load previous versions of this database’s intermediate data.

Parameters

version (*) – Version of the database to load. Default version is the latest version.

Returns

The intermediate data, a dictionary.

make_latest_version()#

Make the current version the latest version.

Requires loading data because a new intermediate data file is created.

register(**kwargs)#

Register a database with the metadata store.

Databases must be registered before data can be written.

revert(version)#

Return data to a previous state.

Warning

Reverting can lead to data loss, e.g. if you revert from version 3 to version 1, and then save your database, you will overwrite version 2. Use make_latest_version() before saving, which will set the current version to 4.

Parameters

version (*) – Number of the version to revert to.

versions()#

Get a list of available versions of this database.

Returns

List of (version, datetime created) tuples.
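
A sketch of the versioning workflow (the database name is illustrative):

from bw2data.backends import SingleFileDatabase

db = SingleFileDatabase("example single file db")
print(db.versions())      # e.g. [(1, datetime), (2, datetime), (3, datetime)]
db.revert(1)              # current data now matches version 1
db.make_latest_version()  # becomes version 4, so version 2 is not overwritten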

write(data, process=True)#

Serialize data to disk.

Parameters

data (*) – Inventory data

bw2data.backends.convert_backend(database_name, backend)[source]#

Convert a Database to another backend.

bw2data currently supports the default and json backends.

Parameters
  • database_name (*) – Name of database.

  • backend (*) – Type of database. backend should be recognized by DatabaseChooser.

Returns False if the old and new backend are the same. Otherwise returns an instance of the new Database object.
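
A usage sketch (the database name is illustrative):

from bw2data.backends import convert_backend

new_db = convert_backend("example db", "json")
if new_db is False:
    print("Database already uses the 'json' backend")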