bw2io.strategies

Submodules

Package Contents

Functions

add_activity_hash_code(data)

Add code field to characterization factors using activity_hash, if code not already present.

add_cpc_classification_from_single_reference_product(db)

add_database_name(db, name)

Add database name to datasets

assign_only_product_as_production(db)

Assign only product as reference product.

assign_single_product_as_activity(db)

change_electricity_unit_mj_to_kwh(db)

Change datasets with the string electricity in their name from units of MJ to kilowatt hour.

clean_integer_codes(data)

Convert integer activity codes to strings and delete integer codes from exchanges (they can't be believed).

convert_activity_parameters_to_list(data)

Convert activity parameters from dictionary to list of dictionaries

convert_uncertainty_types_to_integers(db)

Generic number conversion functions convert values to floats; convert uncertainty types back to integers.

create_composite_code(db)

Create composite code from activity and flow names

csv_add_missing_exchanges_section(data)

csv_drop_unknown(data)

Drop keys whose values are (Unknown).

csv_numerize(data)

Turns strings into numbers where possible

csv_restore_booleans(data)

Turn True and False into proper booleans, where possible

csv_restore_tuples(data)

Restore tuples separated by :: string

delete_exchanges_missing_activity(db)

Delete exchanges that weren't linked correctly by ecoinvent.

delete_ghost_exchanges(db)

Delete technosphere exchanges which can't be linked due to ecoinvent errors.

delete_integer_codes(data)

Delete integer codes completely from extracted ecospold1 datasets

delete_none_synonyms(db)

drop_falsey_uncertainty_fields_but_keep_zeros(db)

Drop fields like '' but keep zero and NaN.

drop_temporary_outdated_biosphere_flows(db)

Drop biosphere exchanges which aren't used and are outdated

drop_unlinked(db)

Drop all unlinked exchanges. This is the nuclear option - use at your own risk!

drop_unlinked_cfs(data)

Drop CFs which don't have input attribute

drop_unspecified_subcategories(db)

Drop subcategories if they are unspecified, (unspecified), '' (empty string), or None.

ensure_categories_are_tuples(db)

es1_allocate_multioutput(data)

This strategy allocates multioutput datasets to new datasets.

es2_assign_only_product_with_amount_as_reference_product(db)

If a multioutput process has one product with a non-zero amount, assign that product as reference product.

fix_ecoinvent_flows_pre35(db)

fix_localized_water_flows(db)

Change Water, BR to Water.

fix_unreasonably_high_lognormal_uncertainties(db[, ...])

Fix unreasonably high uncertainty values.

fix_zero_allocation_products(db)

Drop all inputs from allocated products which had zero allocation factors.

link_biosphere_by_flow_uuid(db[, biosphere])

link_internal_technosphere_by_composite_code(db)

Link internal technosphere inputs by code.

link_technosphere_based_on_name_unit_location(db[, ...])

Link technosphere exchanges based on name, unit, and location. Can't use categories because we can't reliably extract categories from SimaPro exports, only exchanges.

link_technosphere_by_activity_hash(db[, ...])

Link technosphere exchanges using activity_hash function.

match_subcategories(data, biosphere_db_name[, remove])

Given a characterization with a top-level category, e.g. ('air',), find all biosphere flows with the same top-level categories, and add CFs for these flows as well. Doesn't replace CFs for existing flows with multi-level categories. If remove, also delete the top-level CF, but only if it is unlinked.

migrate_datasets(db, migration)

migrate_exchanges(db, migration)

normalize_biosphere_categories(db[, lcia])

Normalize biosphere categories to ecoinvent 3.1 standard

normalize_biosphere_names(db[, lcia])

Normalize biosphere flow names to ecoinvent 3.1 standard.

normalize_simapro_biosphere_categories(db)

Normalize biosphere categories to ecoinvent standard.

normalize_simapro_biosphere_names(db)

Normalize biosphere flow names to ecoinvent standard

normalize_units(db)

Normalize units in datasets and their exchanges

remove_uncertainty_from_negative_loss_exchanges(db)

Remove uncertainty from negative lognormal exchanges.

remove_unnamed_parameters(db)

Remove parameters which have no name. They can't be used in formulas or referenced.

remove_zero_amount_coproducts(db)

Remove coproducts with zero production amounts from exchanges

remove_zero_amount_inputs_with_no_activity(db)

Remove technosphere exchanges with amount of zero and no uncertainty.

set_biosphere_type(data)

Set CF types to 'biosphere', to keep compatibility with LCI strategies.

set_code_by_activity_hash(db[, overwrite])

Use activity_hash to set dataset code.

set_lognormal_loc_value(db)

Make sure loc value is correct for lognormal uncertainty distributions

sp_allocate_products(db)

Create a dataset from each product in a raw SimaPro dataset

split_exchanges(data, filter_params, changed_attributes)

Split unlinked exchanges in data which satisfy filter_params into new exchanges with changed attributes.

split_simapro_name_geo(db)

Split a name like 'foo/CH U' into name and geo components.

strip_biosphere_exc_locations(db)

Biosphere flows don't have locations - if any are included they can confuse linking

tupleize_categories(db)

update_ecoinvent_locations(db)

Update old ecoinvent location codes

Attributes

link_iterable_by_fields

bw2io.strategies.add_activity_hash_code(data)

Add code field to characterization factors using activity_hash, if code not already present.

bw2io.strategies.add_cpc_classification_from_single_reference_product(db)

bw2io.strategies.add_database_name(db, name)

Add database name to datasets

bw2io.strategies.assign_only_product_as_production(db)

Assign only product as reference product.

Skips datasets that already have a reference product or no production exchanges. Production exchanges must have a name and an amount.

Will replace the following activity fields, if not already specified:

  • ‘name’ - name of reference product

  • ‘unit’ - unit of reference product

  • ‘production amount’ - amount of reference product
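The rule above can be sketched in plain Python. This is an illustrative stand-in, not bw2io's implementation: the function name `assign_only_product_sketch`, the `reference product` key, and the decision to act only when there is exactly one production exchange are assumptions made for the example.

```python
def assign_only_product_sketch(db):
    """Illustrative stand-in; not the bw2io implementation."""
    for ds in db:
        if ds.get("reference product"):
            continue  # already has a reference product
        products = [
            exc for exc in ds.get("exchanges", [])
            if exc.get("type") == "production"
        ]
        if len(products) != 1:
            continue  # assumption: act only on a single production exchange
        product = products[0]
        ds["reference product"] = product["name"]
        # Fill activity fields only where they are not already specified.
        ds.setdefault("name", product["name"])
        ds.setdefault("unit", product.get("unit", ""))
        ds.setdefault("production amount", product["amount"])
    return db


db = [{
    "unit": "kilogram",
    "exchanges": [
        {"type": "production", "name": "steel", "unit": "kg", "amount": 1.0},
        {"type": "technosphere", "name": "iron ore", "amount": 1.6},
    ],
}]
result = assign_only_product_sketch(db)
```

In this sketch the pre-existing `unit` is kept, while `name` and `production amount` are filled in from the product.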

bw2io.strategies.assign_single_product_as_activity(db)

bw2io.strategies.change_electricity_unit_mj_to_kwh(db)

Change datasets with the string electricity in their name from units of MJ to kilowatt hour.

bw2io.strategies.clean_integer_codes(data)

Convert integer activity codes to strings and delete integer codes from exchanges (they can’t be believed).

bw2io.strategies.convert_activity_parameters_to_list(data)

Convert activity parameters from dictionary to list of dictionaries
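A minimal sketch of this kind of conversion, assuming the dictionary key becomes a `name` field on each list entry. The helper name `parameters_dict_to_list` and the sample data are hypothetical; this is not bw2io's code.

```python
from copy import deepcopy


def parameters_dict_to_list(data):
    """Illustrative stand-in; not the bw2io implementation."""
    for ds in data:
        params = ds.get("parameters")
        if isinstance(params, dict):
            # Assumption: the former dictionary key is stored under "name".
            ds["parameters"] = [
                {**deepcopy(value), "name": key}
                for key, value in params.items()
            ]
    return data


data = [{"parameters": {"efficiency": {"amount": 0.9}}}]
converted = parameters_dict_to_list(data)
```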

bw2io.strategies.convert_uncertainty_types_to_integers(db)

Generic number conversion functions convert values to floats; convert uncertainty types back to integers.

bw2io.strategies.create_composite_code(db)

Create composite code from activity and flow names

bw2io.strategies.csv_add_missing_exchanges_section(data)

bw2io.strategies.csv_drop_unknown(data)

Drop keys whose values are (Unknown).

bw2io.strategies.csv_numerize(data)

Turns strings into numbers where possible

bw2io.strategies.csv_restore_booleans(data)

Turn True and False into proper booleans, where possible

bw2io.strategies.csv_restore_tuples(data)

Restore tuples separated by :: string
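As a rough illustration of the idea, a string containing the `::` separator can be turned back into a tuple as below. The helper `restore_tuples_sketch` and the sample row are hypothetical, and the real strategy may differ in which fields it touches.

```python
def restore_tuples_sketch(value, separator="::"):
    """Illustrative stand-in; not the bw2io implementation."""
    if isinstance(value, str) and separator in value:
        return tuple(part.strip() for part in value.split(separator))
    return value


row = {"categories": "air::urban air", "name": "Carbon dioxide"}
restored = {key: restore_tuples_sketch(value) for key, value in row.items()}
```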

bw2io.strategies.delete_exchanges_missing_activity(db)

Delete exchanges that weren’t linked correctly by ecoinvent.

These exchanges are missing the “activityLinkId” attribute, and the flow they want to consume is not produced as the reference product of any activity. See the known data issues report.

bw2io.strategies.delete_ghost_exchanges(db)

Delete technosphere exchanges which can’t be linked due to ecoinvent errors.

A ghost exchange is one which links to a combination of activity and flow which aren’t provided in the database.

bw2io.strategies.delete_integer_codes(data)

Delete integer codes completely from extracted ecospold1 datasets

bw2io.strategies.delete_none_synonyms(db)

bw2io.strategies.drop_falsey_uncertainty_fields_but_keep_zeros(db)

Drop fields like ‘’ but keep zero and NaN.

Note that this doesn’t strip False, which behaves exactly like 0.
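The distinction between falsey values, zero, and NaN can be sketched as follows. The field list and the helper name `drop_falsey_keep_zeros` are assumptions for illustration, not bw2io's code.

```python
import math

# Assumed set of uncertainty fields for this sketch.
UNCERTAINTY_FIELDS = ("loc", "scale", "shape", "minimum", "maximum")


def drop_falsey_keep_zeros(exchange):
    """Illustrative stand-in; not the bw2io implementation."""
    cleaned = dict(exchange)
    for field in UNCERTAINTY_FIELDS:
        if field not in cleaned:
            continue
        value = cleaned[field]
        # NaN is truthy, so it is kept. 0 == False in Python, so False
        # survives this check exactly like zero does.
        if not value and value != 0:
            del cleaned[field]
    return cleaned


exchange = {
    "amount": 1.0,
    "loc": 0,
    "scale": "",
    "shape": None,
    "minimum": math.nan,
    "maximum": False,
}
cleaned = drop_falsey_keep_zeros(exchange)
```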

bw2io.strategies.drop_temporary_outdated_biosphere_flows(db)

Drop biosphere exchanges which aren’t used and are outdated

bw2io.strategies.drop_unlinked(db)

Drop all unlinked exchanges. This is the nuclear option - use at your own risk!

bw2io.strategies.drop_unlinked_cfs(data)

Drop CFs which don’t have input attribute

bw2io.strategies.drop_unspecified_subcategories(db)

Drop subcategories if they are in the following:

  • unspecified

  • (unspecified)

  • '' (empty string)

  • None
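A small sketch of this rule, assuming that only the last element of a categories tuple is checked. The names `UNSPECIFIED` and `drop_unspecified_sketch` are hypothetical.

```python
# Values treated as "no subcategory", taken from the list above.
UNSPECIFIED = {"unspecified", "(unspecified)", "", None}


def drop_unspecified_sketch(categories):
    """Illustrative stand-in; not the bw2io implementation."""
    # Assumption: only the trailing subcategory is inspected.
    if categories and categories[-1] in UNSPECIFIED:
        return tuple(categories[:-1])
    return tuple(categories)


examples = [
    ("air", "unspecified"),
    ("water", ""),
    ("soil", None),
    ("air", "urban air"),
]
trimmed = [drop_unspecified_sketch(c) for c in examples]
```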

bw2io.strategies.ensure_categories_are_tuples(db)

bw2io.strategies.es1_allocate_multioutput(data)

This strategy allocates multioutput datasets to new datasets.

This deletes the multioutput dataset, breaking any existing linking. This shouldn’t be a concern, as you shouldn’t link to a multioutput dataset in any case.

Note that multiple allocations for the same product and input will result in undefined behavior.

bw2io.strategies.es2_assign_only_product_with_amount_as_reference_product(db)

If a multioutput process has one product with a non-zero amount, assign that product as reference product.

This is by default called after remove_zero_amount_coproducts, which will delete the zero-amount coproducts in any case. However, we still keep the zero-amount logic in case people want to keep all coproducts.

bw2io.strategies.fix_ecoinvent_flows_pre35(db)

bw2io.strategies.fix_localized_water_flows(db)

Change Water, BR to Water.

Biosphere flows can’t have locations - locations are defined by the activity dataset.

bw2io.strategies.fix_unreasonably_high_lognormal_uncertainties(db, cutoff=2.5, replacement=0.25)

Fix unreasonably high uncertainty values.

With the default cutoff value of 2.5 and a median of 1, the 95% confidence interval has a high-to-low ratio of roughly 20,000.
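The order of magnitude can be checked with a short calculation. This assumes that the cutoff is the standard deviation of the underlying normal distribution and that the 95% interval spans 1.96 standard deviations on either side of the median; both are assumptions of this sketch rather than statements from the library.

```python
import math

cutoff = 2.5   # assumed to be sigma of the underlying normal distribution
median = 1.0
z = 1.96       # two-sided 95% interval for a normal distribution

low = median * math.exp(-z * cutoff)
high = median * math.exp(z * cutoff)
ratio = high / low  # equals exp(2 * z * cutoff)

print(f"95% interval: {low:.5f} to {high:.1f}, ratio {ratio:.0f}")
```

Under these assumptions the ratio is about 18,000; using 2 standard deviations instead of 1.96 gives about 22,000.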

bw2io.strategies.fix_zero_allocation_products(db)

Drop all inputs from allocated products which had zero allocation factors.

The final production amount is the initial amount times the allocation factor. If this is zero, a singular technosphere matrix is created. We fix this by setting the production amount to one, and deleting all inputs.

Does not modify datasets with more than one production exchange.
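The fix described above can be sketched as follows. The function name `fix_zero_allocation_sketch` and the sample dataset are hypothetical; this is an illustration of the described behavior, not bw2io's code.

```python
def fix_zero_allocation_sketch(db):
    """Illustrative stand-in; not the bw2io implementation."""
    for ds in db:
        production = [
            exc for exc in ds.get("exchanges", [])
            if exc.get("type") == "production"
        ]
        if len(production) != 1:
            continue  # leave datasets without exactly one product untouched
        product = production[0]
        if product["amount"] == 0:
            # A zero production amount would make the technosphere matrix
            # singular, so set it to one and delete all inputs.
            product["amount"] = 1
            ds["exchanges"] = [product]
    return db


db = [{
    "name": "zero-allocated product",
    "exchanges": [
        {"type": "production", "name": "byproduct", "amount": 0},
        {"type": "technosphere", "name": "electricity", "amount": 0},
    ],
}]
fixed = fix_zero_allocation_sketch(db)
```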

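bw2io.strategies.link_biosphere_by_flow_uuid(db[, biosphere])

bw2io.strategies.link_internal_technosphere_by_composite_code(db)

Link internal technosphere inputs by code.

Only links to process datasets actually in the database document.

bw2io.strategies.link_technosphere_based_on_name_unit_location(db[, ...])

Link technosphere exchanges based on name, unit, and location. Can’t use categories because we can’t reliably extract categories from SimaPro exports, only exchanges.

If external_db_name is given, link against that database; otherwise link internally.

bw2io.strategies.link_technosphere_by_activity_hash(db[, ...])

Link technosphere exchanges using activity_hash function.

If external_db_name is given, link against that database; otherwise link internally.

If fields is given, link using only those fields.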
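Linking on a chosen set of fields can be sketched as a dictionary lookup. This example uses a plain tuple of field values as the key instead of the activity_hash function, links only internally, and the name `link_by_fields_sketch` plus the `database` and `code` keys are assumptions made for the illustration.

```python
def link_by_fields_sketch(db, fields=("name", "unit", "location")):
    """Illustrative stand-in; not the bw2io implementation."""
    # Map each dataset's field values to its (database, code) key.
    lookup = {
        tuple(ds.get(field) for field in fields): (ds["database"], ds["code"])
        for ds in db
    }
    for ds in db:
        for exc in ds.get("exchanges", []):
            if exc.get("type") != "technosphere" or exc.get("input"):
                continue  # skip other exchange types and already linked ones
            key = tuple(exc.get(field) for field in fields)
            if key in lookup:
                exc["input"] = lookup[key]
    return db


db = [
    {
        "database": "example", "code": "a",
        "name": "steel", "unit": "kg", "location": "CH",
        "exchanges": [
            {"type": "technosphere", "name": "power", "unit": "kWh",
             "location": "CH", "amount": 2},
            {"type": "technosphere", "name": "unknown", "unit": "kg",
             "location": "CH", "amount": 1},
        ],
    },
    {
        "database": "example", "code": "b",
        "name": "power", "unit": "kWh", "location": "CH",
        "exchanges": [],
    },
]
linked = link_by_fields_sketch(db)
```

Passing a shorter `fields` tuple, for example `("name",)`, loosens the match in the same way that restricting the fields would.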

bw2io.strategies.match_subcategories(data, biosphere_db_name, remove=True)

Given a characterization with a top-level category, e.g. ('air',), find all biosphere flows with the same top-level categories, and add CFs for these flows as well. Doesn’t replace CFs for existing flows with multi-level categories. If remove, also delete the top-level CF, but only if it is unlinked.

bw2io.strategies.migrate_datasets(db, migration)

bw2io.strategies.migrate_exchanges(db, migration)

bw2io.strategies.normalize_biosphere_categories(db, lcia=False)

Normalize biosphere categories to ecoinvent 3.1 standard

bw2io.strategies.normalize_biosphere_names(db, lcia=False)

Normalize biosphere flow names to ecoinvent 3.1 standard.

Assumes that each dataset and each exchange has a name. Will change names even if the exchange is already linked.

bw2io.strategies.normalize_simapro_biosphere_categories(db)

Normalize biosphere categories to ecoinvent standard.

bw2io.strategies.normalize_simapro_biosphere_names(db)

Normalize biosphere flow names to ecoinvent standard

bw2io.strategies.normalize_units(db)

Normalize units in datasets and their exchanges

bw2io.strategies.remove_uncertainty_from_negative_loss_exchanges(db)

Remove uncertainty from negative lognormal exchanges.

There are 15699 of these in ecoinvent 3.3 cutoff.

The basic uncertainty and pedigree matrix are applied rather blindly, and they can produce strange net production values. It makes much more sense to assume that these loss factors are static.

Only applies to exchanges which decrease net production.

bw2io.strategies.remove_unnamed_parameters(db)

Remove parameters which have no name. They can’t be used in formulas or referenced.

bw2io.strategies.remove_zero_amount_coproducts(db)

Remove coproducts with zero production amounts from exchanges

bw2io.strategies.remove_zero_amount_inputs_with_no_activity(db)

Remove technosphere exchanges with amount of zero and no uncertainty.

Input exchanges with zero amounts are the result of the ecoinvent linking algorithm, and can be safely discarded.

bw2io.strategies.set_biosphere_type(data)

Set CF types to ‘biosphere’, to keep compatibility with LCI strategies.

This will overwrite existing type values.

bw2io.strategies.set_code_by_activity_hash(db, overwrite=False)

Use activity_hash to set dataset code.

By default, won’t overwrite existing codes, but will if overwrite is True.
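The overwrite behavior can be sketched with a simple hash over a few fields. The hashing scheme here (MD5 over lowercased name, unit, and location) and the helper names are assumptions for illustration; the fields and details used by activity_hash may differ.

```python
import hashlib


def hash_sketch(ds, fields=("name", "unit", "location")):
    """Hypothetical hash over selected fields; not bw2io's activity_hash."""
    joined = "".join((ds.get(field) or "").lower() for field in fields)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def set_code_sketch(db, overwrite=False):
    """Illustrative stand-in; not the bw2io implementation."""
    for ds in db:
        if "code" not in ds or overwrite:
            ds["code"] = hash_sketch(ds)
    return db


db = [
    {"name": "Steel", "unit": "kg", "location": "CH", "code": "existing"},
    {"name": "Power", "unit": "kWh", "location": "CH"},
]
kept = [ds["code"] for ds in set_code_sketch([dict(ds) for ds in db])]
replaced = [ds["code"] for ds in set_code_sketch([dict(ds) for ds in db], overwrite=True)]
```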

bw2io.strategies.set_lognormal_loc_value(db)

Make sure loc value is correct for lognormal uncertainty distributions

bw2io.strategies.sp_allocate_products(db)

Create a dataset from each product in a raw SimaPro dataset

bw2io.strategies.split_exchanges(data, filter_params, changed_attributes, allocation_factors=None)

Split unlinked exchanges in data which satisfy filter_params into new exchanges with changed attributes.

changed_attributes is a list of dictionaries with the attributes that should be changed.

allocation_factors is an optional list of floats to allocate the original exchange amount to the respective copies defined in changed_attributes. They don’t have to sum to one. If allocation_factors are not defined, then exchanges are split equally.

Resets uncertainty to UndefinedUncertainty (0).

To use this function as a strategy, you will need to curry it first using functools.partial.

Example usage:

split_exchanges(
    [
        {'exchanges': [{
            'name': 'foo',
            'location': 'bar',
            'amount': 20
        }, {
            'name': 'food',
            'location': 'bar',
            'amount': 12
        }]}
    ],
    {'name': 'foo'},
    [{'location': 'A'}, {'location': 'B', 'cat': 'dog'}],
    allocation_factors=[0.6, 0.4],
)
>>> [
    {'exchanges': [{
        'name': 'food',
        'location': 'bar',
        'amount': 12
    }, {
        'name': 'foo',
        'location': 'A',
        'amount': 12.,
        'uncertainty_type': 0
    }, {
        'name': 'foo',
        'location': 'B',
        'amount': 8.,
        'uncertainty_type': 0,
        'cat': 'dog',
    }]}
]
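Currying with functools.partial can be sketched as follows. Because the point is the binding of arguments, the example uses `split_exchanges_sketch`, a hypothetical re-implementation of the behavior documented above, so that it runs without bw2io installed; it is not the library's code.

```python
from copy import deepcopy
from functools import partial


def split_exchanges_sketch(data, filter_params, changed_attributes,
                           allocation_factors=None):
    """Illustrative re-implementation of the documented behavior."""
    if allocation_factors is None:
        allocation_factors = [1.0] * len(changed_attributes)  # equal split
    total = sum(allocation_factors)  # factors need not sum to one
    for ds in data:
        kept, added = [], []
        for exc in ds.get("exchanges", []):
            unlinked = not exc.get("input")
            matches = all(exc.get(k) == v for k, v in filter_params.items())
            if not (unlinked and matches):
                kept.append(exc)
                continue
            for factor, attrs in zip(allocation_factors, changed_attributes):
                new_exc = deepcopy(exc)
                new_exc["amount"] = exc["amount"] * factor / total
                new_exc["uncertainty_type"] = 0  # undefined uncertainty
                new_exc.update(attrs)
                added.append(new_exc)
        ds["exchanges"] = kept + added
    return data


# A strategy is called with the data only, so bind the other arguments first.
split_foo = partial(
    split_exchanges_sketch,
    filter_params={"name": "foo"},
    changed_attributes=[{"location": "A"}, {"location": "B", "cat": "dog"}],
    allocation_factors=[3, 2],
)

data = [{"exchanges": [
    {"name": "foo", "location": "bar", "amount": 20},
    {"name": "food", "location": "bar", "amount": 12},
]}]
result = split_foo(data)
```

The resulting `split_foo` takes a single argument, so it can sit in a list of strategies next to functions that only accept the data.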

bw2io.strategies.split_simapro_name_geo(db)

Split a name like ‘foo/CH U’ into name and geo components.

Stores the original name under the simapro name key.
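A regular expression is one way to perform this kind of split. The pattern below is an assumption made for illustration (a name, a slash, a short location code, a space, and a trailing S or U), as are the helper name and the `location` key; the pattern used by bw2io may differ.

```python
import re

# Assumed pattern: "<name>/<geo> <S or U>", e.g. "foo/CH U".
NAME_GEO = re.compile(r"^(?P<name>.+?)/(?P<geo>[A-Za-z\-]{2,10}) [SU]$")


def split_name_geo_sketch(ds):
    """Illustrative stand-in; not the bw2io implementation."""
    match = NAME_GEO.match(ds["name"])
    if match:
        ds["simapro name"] = ds["name"]  # keep the original name
        ds["name"] = match.group("name")
        ds["location"] = match.group("geo")
    return ds


split = split_name_geo_sketch({"name": "foo/CH U"})
unchanged = split_name_geo_sketch({"name": "plain name"})
```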

bw2io.strategies.strip_biosphere_exc_locations(db)

Biosphere flows don’t have locations - if any are included they can confuse linking

bw2io.strategies.tupleize_categories(db)

bw2io.strategies.update_ecoinvent_locations(db)

Update old ecoinvent location codes