bw2io.strategies
Submodules
bw2io.strategies.biosphere
bw2io.strategies.csv
bw2io.strategies.ecospold1
bw2io.strategies.ecospold1_allocation
bw2io.strategies.ecospold2
bw2io.strategies.exiobase
bw2io.strategies.generic
bw2io.strategies.json_ld
bw2io.strategies.json_ld_allocation
bw2io.strategies.json_ld_lcia
bw2io.strategies.lcia
bw2io.strategies.locations
bw2io.strategies.migrations
bw2io.strategies.simapro
bw2io.strategies.special
bw2io.strategies.useeio
Package Contents
Functions
- bw2io.strategies.add_activity_hash_code(data)
Add a `code` field to characterization factors using `activity_hash`, if `code` is not already present.
- bw2io.strategies.assign_only_product_as_production(db)
Assign only product as reference product.
Skips datasets that already have a reference product or no production exchanges. Production exchanges must have a `name` and an amount.
Will replace the following activity fields, if not already specified:
‘name’ - name of reference product
‘unit’ - unit of reference product
‘production amount’ - amount of reference product
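The rule above can be illustrated with a minimal sketch. It is not the bw2io implementation: the helper name `assign_single_product_sketch` is hypothetical, and it assumes that "only product" means exactly one exchange of type `production` and that the reference product is stored under a `reference product` key.

```python
def assign_single_product_sketch(datasets):
    """Hypothetical sketch of the rule described above (not bw2io code)."""
    for ds in datasets:
        if ds.get("reference product"):
            continue  # dataset already has a reference product
        products = [
            exc for exc in ds.get("exchanges", [])
            if exc.get("type") == "production"
        ]
        if len(products) != 1:
            continue  # assumption: act only when there is exactly one product
        product = products[0]
        ds["reference product"] = product["name"]
        # Fill activity fields only if they are not already specified
        ds.setdefault("name", product["name"])
        ds.setdefault("unit", product.get("unit"))
        ds.setdefault("production amount", product["amount"])
    return datasets


sample = [{
    "exchanges": [
        {"type": "production", "name": "steel", "unit": "kilogram", "amount": 1.0},
        {"type": "technosphere", "name": "iron ore", "amount": 1.6},
    ]
}]
result = assign_single_product_sketch(sample)
print(result[0]["reference product"], result[0]["unit"])
```

Using `setdefault` mirrors the "if not already specified" wording: existing activity fields are left untouched.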
- bw2io.strategies.change_electricity_unit_mj_to_kwh(db)
Change datasets with the string `electricity` in their name from units of MJ to kilowatt hour.
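As a rough illustration of the unit change, the sketch below converts amounts using 1 kWh = 3.6 MJ. The helper `electricity_mj_to_kwh` is hypothetical, and the unit strings `megajoule` and `kilowatt hour` and the `amount` field are assumptions about the data layout, not taken from bw2io.

```python
MJ_PER_KWH = 3.6  # 1 kWh = 3.6 MJ


def electricity_mj_to_kwh(exchanges):
    """Hypothetical sketch: return copies with electricity flows in kWh."""
    converted = []
    for exc in exchanges:
        exc = dict(exc)  # work on a copy, leave the input untouched
        is_electricity = "electricity" in exc.get("name", "").lower()
        if is_electricity and exc.get("unit") == "megajoule":
            exc["amount"] = exc["amount"] / MJ_PER_KWH
            exc["unit"] = "kilowatt hour"
        converted.append(exc)
    return converted


example = [
    {"name": "Electricity, low voltage", "unit": "megajoule", "amount": 7.2},
    {"name": "Heat, district", "unit": "megajoule", "amount": 7.2},
]
converted = electricity_mj_to_kwh(example)
print(converted)
```

Only the electricity flow is rescaled; the heat flow keeps its unit and amount.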
- bw2io.strategies.clean_integer_codes(data)
Convert integer activity codes to strings and delete integer codes from exchanges.
- Parameters
data (list of dict) – List of datasets, where each dataset is a dictionary containing information about the dataset, such as its name, description, and exchanges.
- Returns
The cleaned list of datasets, where integer activity codes have been converted to strings and integer codes have been deleted from exchanges.
- Return type
list of dict
Examples
>>> data = [{'name': 'Dataset A', 'description': '...', 'code': 123},
...         {'name': 'Dataset B', 'description': '...', 'exchanges': [{'code': 456, 'amount': 1.0}]}]
>>> clean_integer_codes(data)
[{'name': 'Dataset A', 'description': '...', 'code': '123'},
 {'name': 'Dataset B', 'description': '...', 'exchanges': [{'amount': 1.0}]}]
- bw2io.strategies.convert_activity_parameters_to_list(data)
Convert activity parameters from dictionary to list of dictionaries
- bw2io.strategies.convert_uncertainty_types_to_integers(db)
Convert uncertainty types back to integers. Generic number conversion functions convert them to floats.
- bw2io.strategies.create_composite_code(db)
Create composite code from activity and flow names
- bw2io.strategies.csv_add_missing_exchanges_section(data)
Add an empty exchanges section to any dictionary in data that doesn’t already have one.
- Parameters
data (list of dict) – A list of dictionaries, where each dictionary represents a row of data.
- Returns
The updated list of dictionaries with an empty exchanges section added to any dictionary that doesn’t already have one.
- Return type
list[dict]
Examples
>>> data = [
...     {"name": "John", "age": 30},
...     {"name": "Alice", "age": 25, "exchanges": []},
...     {"name": "Bob", "age": 40, "exchanges": [{"name": "NYSE"}]},
... ]
>>> csv_add_missing_exchanges_section(data)
[{"name": "John", "age": 30, "exchanges": []},
 {"name": "Alice", "age": 25, "exchanges": []},
 {"name": "Bob", "age": 40, "exchanges": [{"name": "NYSE"}]}]
- bw2io.strategies.csv_drop_unknown(data)
Remove any keys whose values are (Unknown).
- Parameters
data (list[dict]) – A list of dictionaries, where each dictionary represents a row of data.
- Returns
The updated list of dictionaries, with every key whose value is (Unknown) removed.
- Return type
list[dict]
Examples
>>> data = [
...     {"name": "John", "age": 30, "gender": "(Unknown)"},
...     {"name": "Alice", "age": 25, "gender": "Female"},
...     {"name": "Bob", "age": 40, "gender": "Male"},
... ]
>>> csv_drop_unknown(data)
[{"name": "John", "age": 30},
 {"name": "Alice", "age": 25, "gender": "Female"},
 {"name": "Bob", "age": 40, "gender": "Male"}]
- bw2io.strategies.csv_numerize(data)
Convert string values to float or int where possible
- Parameters
data (list of dict) – A list of datasets.
- Returns
A list of datasets with string values converted to float or int where possible.
- Return type
list of dict
Examples
>>> data = [{'amount': '10.0'}, {'exchanges': [{'amount': '20', 'uncertainty type': 'undefined'}]}]
>>> csv_numerize(data)
[{'amount': 10.0}, {'exchanges': [{'amount': 20, 'uncertainty type': 'undefined'}]}]
- bw2io.strategies.csv_restore_booleans(data)
Convert boolean-like strings to booleans where possible.
- Parameters
data (list of dict) – A list of datasets.
- Returns
A list of datasets with booleans restored.
- Return type
list of dict
Examples
>>> data = [{'categories': 'category1', 'is_animal': 'true'},
...         {'exchanges': [{'categories': 'category2', 'amount': '10.0', 'uncertainty type': 'undefined', 'is_biomass': 'False'}]}]
>>> csv_restore_booleans(data)
[{'categories': 'category1', 'is_animal': True},
 {'exchanges': [{'categories': 'category2', 'amount': '10.0', 'uncertainty type': 'undefined', 'is_biomass': False}]}]
- bw2io.strategies.csv_restore_tuples(data)
Convert tuple-like strings to actual tuples.
- Parameters
data (list of dict) – A list of datasets.
- Returns
A list of datasets with tuples restored from string.
- Return type
list of dict
Examples
>>> data = [{'categories': 'category1::category2'},
...         {'exchanges': [{'categories': 'category3::category4', 'amount': '10.0'}]}]
>>> csv_restore_tuples(data)
[{'categories': ('category1', 'category2')},
 {'exchanges': [{'categories': ('category3', 'category4'), 'amount': '10.0'}]}]
- bw2io.strategies.delete_exchanges_missing_activity(db)
Delete exchanges that weren’t linked correctly by ecoinvent.
These exchanges are missing the “activityLinkId” attribute, and the flow they want to consume is not produced as the reference product of any activity. See the known data issues report.
- bw2io.strategies.delete_ghost_exchanges(db)
Delete technosphere exchanges which can’t be linked due to ecoinvent errors.
A ghost exchange is one which links to a combination of activity and flow which isn’t provided in the database.
- bw2io.strategies.delete_integer_codes(data)
Delete integer codes from the input data dictionary.
- Parameters
data (list[dict]) – A list of dictionaries, where each dictionary represents a row of data. Each dictionary should have a name key, and optionally a code and/or exchanges key.
- Returns
The updated list of dictionaries with any integer code keys removed from the dictionaries and from their exchanges.
- Return type
list[dict]
Examples
>>> data = [{'name': 'test', 'code': 1}, {'name': 'test2', 'exchanges': [{'code': 2}]}]
>>> data = delete_integer_codes(data)
>>> data == [{'name': 'test'}, {'name': 'test2', 'exchanges': [{}]}]
True
- bw2io.strategies.drop_falsey_uncertainty_fields_but_keep_zeros(db)
Drop fields like ‘’ but keep zero and NaN.
Note that this doesn’t strip False, which behaves exactly like 0.
- bw2io.strategies.drop_temporary_outdated_biosphere_flows(db)
Drop biosphere exchanges which aren’t used and are outdated
- bw2io.strategies.drop_unspecified_subcategories(db)
Drop subcategories if they are in the following:
* `unspecified`
* `(unspecified)`
* `''` (empty string)
* `None`
- Parameters
db (list) – A list of datasets, each containing exchanges.
- Returns
A modified list of datasets with unspecified subcategories removed.
- Return type
list
Examples
>>> db = [{"categories": ["A", "unspecified"]},
...       {"exchanges": [{"categories": ["B", ""]}]},
...       {"categories": ["C", None]}]
>>> new_db = drop_unspecified_subcategories(db)
>>> new_db
[{"categories": ["A"]}, {"exchanges": [{"categories": ["B"]}]}, {"categories": ["C"]}]
- bw2io.strategies.ensure_categories_are_tuples(db)
Convert dataset categories to tuples in the given database, if they are not already tuples.
- Parameters
db (list) – A list of datasets, each containing exchanges.
- Returns
A modified list of datasets with categories as tuples.
- Return type
list
Examples
>>> db = [{"categories": ["A", "B"]}, {"categories": ("C", "D")}]
>>> new_db = ensure_categories_are_tuples(db)
>>> new_db
[{"categories": ("A", "B")}, {"categories": ("C", "D")}]
- bw2io.strategies.es1_allocate_multioutput(data)
This strategy allocates multioutput datasets to new datasets.
This deletes the multioutput dataset, breaking any existing linking. This shouldn’t be a concern, as you shouldn’t link to a multioutput dataset in any case.
Note that multiple allocations for the same product and input will result in undefined behavior.
- Parameters
data (list of dict) – List of datasets, where each dataset is a dictionary containing information about the dataset, such as its name, description, and exchanges.
- Returns
The new list of datasets, where multioutput datasets have been allocated to new datasets.
- Return type
list of dict
Examples
>>> data = [{'name': 'Dataset A',
...          'exchanges': [{'name': 'Output 1', 'amount': 1.0},
...                        {'name': 'Output 2', 'amount': 2.0}],
...          'allocations': [{'name': 'Activity 1', 'product': 'Output 1', 'input': 'Input 1'},
...                          {'name': 'Activity 2', 'product': 'Output 2', 'input': 'Input 2'}]},
...         {'name': 'Dataset B',
...          'exchanges': [{'name': 'Output 1', 'amount': 1.0}],
...          'allocations': [{'name': 'Activity 3', 'product': 'Output 1', 'input': 'Input 3'}]}]
>>> es1_allocate_multioutput(data)
[{'name': 'Dataset A: Output 1',
  'exchanges': [{'name': 'Output 1', 'amount': 1.0}],
  'allocations': [{'name': 'Activity 1', 'product': 'Output 1', 'input': 'Input 1'}]},
 {'name': 'Dataset A: Output 2',
  'exchanges': [{'name': 'Output 2', 'amount': 2.0}],
  'allocations': [{'name': 'Activity 2', 'product': 'Output 2', 'input': 'Input 2'}]},
 {'name': 'Dataset B',
  'exchanges': [{'name': 'Output 1', 'amount': 1.0}],
  'allocations': [{'name': 'Activity 3', 'product': 'Output 1', 'input': 'Input 3'}]}]
- bw2io.strategies.es2_assign_only_product_with_amount_as_reference_product(db)
If a multioutput process has one product with a non-zero amount, assign that product as reference product.
This is by default called after `remove_zero_amount_coproducts`, which will delete the zero-amount coproducts in any case. However, we still keep the zero-amount logic in case people want to keep all coproducts.
- bw2io.strategies.fix_localized_water_flows(db)
Change `Water, BR` to `Water`.
Biosphere flows can’t have locations - locations are defined by the activity dataset.
- bw2io.strategies.fix_unreasonably_high_lognormal_uncertainties(db, cutoff=2.5, replacement=0.25)
Fix unreasonably high uncertainty values.
With the default cutoff value of 2.5 and a median of 1, the 95% confidence interval has a high-to-low ratio of roughly 20,000.
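The order of magnitude quoted above can be checked with a short calculation. This sketch assumes that `cutoff` and `replacement` are the scale parameter (sigma of the underlying normal distribution) of the lognormal; the helper name `lognormal_ci_ratio` is hypothetical.

```python
import math


def lognormal_ci_ratio(sigma, z=1.96):
    """Upper/lower bound ratio of the central 95% interval of a lognormal.

    The bounds are median * exp(+z * sigma) and median * exp(-z * sigma),
    so the median cancels out of the ratio.
    """
    return math.exp(2 * z * sigma)


ratio_at_cutoff = lognormal_ci_ratio(2.5)        # on the order of 10**4
ratio_at_replacement = lognormal_ci_ratio(0.25)  # a bit less than 3
print(round(ratio_at_cutoff), round(ratio_at_replacement, 2))
```

Under that assumption, the default replacement value of 0.25 narrows the interval from four orders of magnitude to less than a factor of three.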
- bw2io.strategies.fix_zero_allocation_products(db)
Drop all inputs from allocated products which had zero allocation factors.
The final production amount is the initial amount times the allocation factor. If this is zero, a singular technosphere matrix is created. We fix this by setting the production amount to one, and deleting all inputs.
Does not modify datasets with more than one production exchange.
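A minimal sketch of this fix, assuming exchanges carry `type` and `amount` keys. The helper `fix_zero_production_sketch` is hypothetical and leaves out any other bookkeeping the real strategy may do.

```python
def fix_zero_production_sketch(datasets):
    """Hypothetical sketch: avoid a singular technosphere matrix."""
    for ds in datasets:
        production = [
            exc for exc in ds.get("exchanges", [])
            if exc.get("type") == "production"
        ]
        # Datasets with more than one production exchange are not modified
        if len(production) != 1 or production[0]["amount"] != 0:
            continue
        production[0]["amount"] = 1   # non-zero entry on the matrix diagonal
        ds["exchanges"] = production  # delete all inputs
    return datasets


zero_allocated = [{
    "name": "by-product with zero allocation factor",
    "exchanges": [
        {"type": "production", "name": "slag", "amount": 0},
        {"type": "technosphere", "name": "coal", "amount": 3.0},
    ],
}]
fixed = fix_zero_production_sketch(zero_allocated)
print(fixed[0]["exchanges"])
```

The resulting dataset produces one unit of the product at no cost, which keeps the matrix invertible without changing results for the other products.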
- bw2io.strategies.json_ld_add_activity_unit(db)
Add units to activities from their reference products.
- bw2io.strategies.json_ld_allocate_datasets(db, preferred_allocation=None)
Perform allocation on multifunctional datasets.
Uses the `preferred_allocation` method if available; otherwise, the default method.
Here are the allocation methods listed in the JSON-LD spec:
PHYSICAL_ALLOCATION
ECONOMIC_ALLOCATION
CAUSAL_ALLOCATION (Can be exchange-specific)
USE_DEFAULT_ALLOCATION
NO_ALLOCATION
We can’t use `@id` values as codes after allocation, so we combine the process id and the flow id for the allocated dataset.
- bw2io.strategies.json_ld_convert_unit_to_reference_unit(db)
Convert the units to their reference unit. Also changes the format to eliminate unnecessary complexity.
Changes:
To:
{'flow': {…}, 'unit': 'MJ'}
- bw2io.strategies.json_ld_get_activities_list_from_rawdata(data)
Return list of processes from raw data.
- bw2io.strategies.json_ld_get_normalized_exchange_locations(data)
The exchanges location strings are not necessarily the same as those given in the process or the master metadata. Fix this inconsistency.
This has to happen before we transform the input data from a dictionary to a list of activities, as it uses the `locations` data.
- bw2io.strategies.json_ld_get_normalized_exchange_units(data)
The exchanges unit strings are not necessarily the same as BW units. Fix this inconsistency.
- bw2io.strategies.json_ld_rename_metadata_fields(db)
Change metadata field names from the JSON-LD processes to BW schema.
BW schema: https://wurst.readthedocs.io/#internal-data-format
- bw2io.strategies.link_internal_technosphere_by_composite_code(db)
Link internal technosphere inputs by `code`.
Only links to process datasets actually in the database document.
- bw2io.strategies.link_iterable_by_fields(unlinked, other=None, fields=None, kind=None, internal=False, relink=False)
Generic function to link objects in `unlinked` to objects in `other` using fields `fields`.
The database to be linked must have uniqueness for each object for the given `fields`.
If `kind`, limit the exchanges in `unlinked` objects to types in `kind`.
If `relink`, also link exchanges which already have an `input`. Otherwise, skip already linked exchanges.
If `internal`, link `unlinked` to other objects in `unlinked`. Each object must have the attributes `database` and `code`.
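The idea of linking by fields can be sketched as a dictionary lookup keyed on the chosen fields. This is a simplified illustration of the internal case only, not the bw2io implementation; the helper `link_by_fields_sketch` and its default field names are assumptions.

```python
def link_by_fields_sketch(datasets, fields=("name", "unit", "location"),
                          kind=None, relink=False):
    """Hypothetical, simplified internal linking by matching field values."""
    def key(obj):
        return tuple(obj.get(field) for field in fields)

    # Build the lookup and enforce uniqueness for the given fields
    candidates = {}
    for ds in datasets:
        if key(ds) in candidates:
            raise ValueError(f"Fields {fields} are not unique for {key(ds)}")
        candidates[key(ds)] = (ds["database"], ds["code"])

    for ds in datasets:
        for exc in ds.get("exchanges", []):
            if kind and exc.get("type") not in kind:
                continue  # restrict linking to the requested exchange types
            if exc.get("input") and not relink:
                continue  # skip exchanges that are already linked
            target = candidates.get(key(exc))
            if target is not None:
                exc["input"] = target
    return datasets


db = [
    {"database": "demo", "code": "a", "name": "steel", "unit": "kg", "location": "CH",
     "exchanges": [{"type": "technosphere", "name": "power", "unit": "kWh", "location": "CH"}]},
    {"database": "demo", "code": "b", "name": "power", "unit": "kWh", "location": "CH",
     "exchanges": []},
]
linked = link_by_fields_sketch(db, kind={"technosphere"})
print(linked[0]["exchanges"][0]["input"])
```

The uniqueness check matters because a lookup keyed on non-unique fields would silently link to whichever dataset happened to be stored last.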
- bw2io.strategies.link_technosphere_based_on_name_unit_location(db, external_db_name=None)
Link technosphere exchanges based on name, unit, and location. Can’t use categories because we can’t reliably extract categories from SimaPro exports, only exchanges.
If `external_db_name`, link against a different database; otherwise link internally.
- bw2io.strategies.link_technosphere_by_activity_hash(db, external_db_name=None, fields=None)
Link technosphere exchanges using the `activity_hash` function.
If `external_db_name`, link against a different database; otherwise link internally.
If `fields`, link using only certain fields.
- bw2io.strategies.match_subcategories(data, biosphere_db_name, remove=True)
Given a characterization with a top-level category, e.g. `('air',)`, find all biosphere flows with the same top-level categories, and add CFs for these flows as well. Doesn’t replace CFs for existing flows with multi-level categories. If `remove`, also delete the top-level CF, but only if it is unlinked.
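The expansion step can be sketched as follows. This is an illustration under stated assumptions, not the bw2io implementation: the helper `expand_top_level_cfs` is hypothetical, flows are matched on `name` plus the first category, and the `remove` behaviour is left out.

```python
def expand_top_level_cfs(cfs, biosphere_flows):
    """Hypothetical sketch: copy top-level CFs to matching subcategory flows."""
    existing = {(cf["name"], tuple(cf["categories"])) for cf in cfs}
    added = []
    for cf in cfs:
        categories = tuple(cf["categories"])
        if len(categories) != 1:
            continue  # only top-level CFs such as ('air',) are expanded
        for flow in biosphere_flows:
            flow_categories = tuple(flow["categories"])
            same_flow = flow["name"] == cf["name"]
            is_child = len(flow_categories) > 1 and flow_categories[0] == categories[0]
            # Never replace a CF that already exists for a multi-level category
            if same_flow and is_child and (flow["name"], flow_categories) not in existing:
                added.append({"name": flow["name"],
                              "categories": flow_categories,
                              "amount": cf["amount"]})
    return cfs + added


cfs = [
    {"name": "Methane", "categories": ("air",), "amount": 25.0},
    {"name": "Methane", "categories": ("air", "urban"), "amount": 30.0},
]
flows = [
    {"name": "Methane", "categories": ("air", "urban")},
    {"name": "Methane", "categories": ("air", "rural")},
    {"name": "Methane", "categories": ("water", "river")},
]
expanded = expand_top_level_cfs(cfs, flows)
print(len(expanded))
```

In this example only the `('air', 'rural')` flow receives a new CF: the urban flow already has its own value and the water flow belongs to a different top-level category.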
- bw2io.strategies.normalize_biosphere_categories(db, lcia=False)
Normalize biosphere categories to ecoinvent 3.1 standard in the given database.
- Parameters
db (list) – A list of datasets, each containing exchanges.
lcia (bool, optional) – If True, only normalize biosphere categories in LCIA datasets. Defaults to False.
- Returns
A modified list of datasets with normalized biosphere categories.
- Return type
list
Examples
>>> db = [{"categories": ["old_biosphere_category"]}]
>>> new_db = normalize_biosphere_categories(db)
>>> new_db
[{"categories": ["new_biosphere_category"]}]
- bw2io.strategies.normalize_biosphere_names(db, lcia=False)
Normalize biosphere flow names to ecoinvent 3.1 standard in the given database.
Assumes that each dataset and each exchange have a `name`. Will change names even if exchange is already linked.
- Parameters
db (list) – A list of datasets, each containing exchanges.
lcia (bool, optional) – If True, only normalize biosphere flow names in LCIA datasets. Default is False.
- Returns
A modified list of datasets with normalized biosphere flow names.
- Return type
list
Examples
>>> db = [{"name": "old_biosphere_name"}]
>>> new_db = normalize_biosphere_names(db)
>>> new_db
[{"name": "new_biosphere_name"}]
- bw2io.strategies.normalize_simapro_biosphere_categories(db)
Normalize biosphere categories to ecoinvent standard.
- bw2io.strategies.normalize_simapro_biosphere_names(db)
Normalize biosphere flow names to ecoinvent standard
- bw2io.strategies.remove_random_exchanges(data, fraction=0.9)
Remove most inputs to make the US EEIO have a structure more like other LCA databases
- bw2io.strategies.remove_uncertainty_from_negative_loss_exchanges(db)
Remove uncertainty from negative lognormal exchanges.
There are 15699 of these in ecoinvent 3.3 cutoff.
The basic uncertainty and pedigree matrix are applied rather blindly, and they can produce strange net production values. It makes much more sense to assume that these loss factors are static.
Only applies to exchanges which decrease net production.
- bw2io.strategies.remove_unnamed_parameters(db)
Remove parameters which have no name. They can’t be used in formulas or referenced.
- bw2io.strategies.remove_useeio_products(data)
Remove products from US EEIO and collapse to only activities
- bw2io.strategies.remove_zero_amount_coproducts(db)
Remove coproducts with zero production amounts from `exchanges`
- bw2io.strategies.remove_zero_amount_inputs_with_no_activity(db)
Remove technosphere exchanges with amount of zero and no uncertainty.
Input exchanges with zero amounts are the result of the ecoinvent linking algorithm, and can be safely discarded.
- bw2io.strategies.set_biosphere_type(data)
Set CF types to ‘biosphere’, to keep compatibility with LCI strategies.
This will overwrite existing `type` values.
- bw2io.strategies.set_code_by_activity_hash(db, overwrite=False)
Use `activity_hash` to set dataset code.
By default, won’t overwrite existing codes, but will if `overwrite` is `True`.
- bw2io.strategies.set_lognormal_loc_value(db)
Make sure `loc` value is correct for lognormal uncertainty distributions
- bw2io.strategies.sp_allocate_products(db)
Create a dataset from each product in a raw SimaPro dataset
- bw2io.strategies.split_exchanges(data, filter_params, changed_attributes, allocation_factors=None)
Split unlinked exchanges in `data` which satisfy `filter_params` into new exchanges with changed attributes.
`changed_attributes` is a list of dictionaries with the attributes that should be changed.
`allocation_factors` is an optional list of floats to allocate the original exchange amount to the respective copies defined in `changed_attributes`. They don’t have to sum to one. If `allocation_factors` are not defined, then exchanges are split equally.
Resets uncertainty to `UndefinedUncertainty` (0).
To use this function as a strategy, you will need to curry it first using `functools.partial`.
Example usage:
split_exchanges(
    [
        {'exchanges': [
            {'name': 'foo', 'location': 'bar', 'amount': 20},
            {'name': 'food', 'location': 'bar', 'amount': 12},
        ]}
    ],
    {'name': 'foo'},
    [{'location': 'A'}, {'location': 'B', 'cat': 'dog'}],
    allocation_factors=[3, 2],
)
>>> [
    {'exchanges': [
        {'name': 'food', 'location': 'bar', 'amount': 12},
        {'name': 'foo', 'location': 'A', 'amount': 12., 'uncertainty_type': 0},
        {'name': 'foo', 'location': 'B', 'amount': 8., 'uncertainty_type': 0, 'cat': 'dog'},
    ]}
]
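Currying with `functools.partial` can be sketched as follows. To keep the example self-contained, it uses a simplified stand-in, `split_exchanges_sketch`, written from the description above; it is hypothetical and not the bw2io code, and the `uncertainty_type` key simply follows the example output shown above.

```python
from copy import deepcopy
from functools import partial


def split_exchanges_sketch(data, filter_params, changed_attributes,
                           allocation_factors=None):
    """Hypothetical stand-in with the same signature as split_exchanges."""
    if allocation_factors is None:
        allocation_factors = [1] * len(changed_attributes)  # equal split
    total = sum(allocation_factors)  # factors need not sum to one
    for ds in data:
        if "exchanges" not in ds:
            continue
        kept, added = [], []
        for exc in ds["exchanges"]:
            matches = all(exc.get(k) == v for k, v in filter_params.items())
            if exc.get("input") or not matches:
                kept.append(exc)  # linked or non-matching: leave as is
                continue
            for factor, attrs in zip(allocation_factors, changed_attributes):
                new = deepcopy(exc)
                new["amount"] = exc["amount"] * factor / total
                new["uncertainty_type"] = 0  # undefined uncertainty
                new.update(attrs)
                added.append(new)
        ds["exchanges"] = kept + added
    return data


# A strategy takes only the data, so the other arguments are bound up front
split_foo = partial(
    split_exchanges_sketch,
    filter_params={"name": "foo"},
    changed_attributes=[{"location": "A"}, {"location": "B", "cat": "dog"}],
    allocation_factors=[3, 2],
)

db = [{"exchanges": [
    {"name": "foo", "location": "bar", "amount": 20},
    {"name": "food", "location": "bar", "amount": 12},
]}]
split_result = split_foo(db)
print(split_result)
```

The resulting `split_foo` callable takes a single argument, so it can sit in a list of strategies next to functions that only accept the data.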
- bw2io.strategies.split_simapro_name_geo(db)
Split a name like ‘foo/CH U’ into name and geo components.
Sets original name to `simapro name`.
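The split can be illustrated with a regular expression. The pattern below is an assumption made for this sketch and is not the one used by bw2io; the helper `split_name_geo_sketch` and the choice to store the geo component under `location` are likewise hypothetical.

```python
import re

# Assumed pattern: "<name>/<geo code> <S or U>", e.g. "foo/CH U"
NAME_GEO = re.compile(r"^(?P<name>.+?)/(?P<geo>[A-Za-z]{2,10}) (?P<kind>[SU])$")


def split_name_geo_sketch(dataset):
    """Hypothetical sketch: split a SimaPro-style name into name and geo."""
    match = NAME_GEO.match(dataset["name"])
    if match is None:
        return dataset  # names without a geo suffix are left untouched
    dataset["simapro name"] = dataset["name"]  # keep the original name
    dataset["name"] = match.group("name")
    dataset["location"] = match.group("geo")
    return dataset


split = split_name_geo_sketch({"name": "foo/CH U"})
plain = split_name_geo_sketch({"name": "foo"})
print(split, plain)
```

Keeping the original string under `simapro name` means the transformation can be traced back if the pattern matches something it should not have.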
- bw2io.strategies.strip_biosphere_exc_locations(db)
Remove locations from biosphere exchanges in the given database, as biosphere exchanges are not geographically specific.
- Parameters
db (list) – A list of datasets, each containing exchanges.
- Returns
A modified list of datasets with locations removed from biosphere exchanges.
- Return type
list
Examples
>>> db = [{"exchanges": [{"type": "biosphere", "location": "GLO"}]}]
>>> new_db = strip_biosphere_exc_locations(db)
>>> new_db
[{"exchanges": [{"type": "biosphere"}]}]
- bw2io.strategies.update_social_flows_in_older_consequential(db, biosphere_db)
The consequential system model automatically generates new biosphere flows with the category `social` (even though they aren’t social flows) which are not really used and definitely not characterized, and whose UUID seems to change with each release. They are:
residual wood, dry
venting of argon, crude, liquid
venting of nitrogen, liquid
The ecoinvent centre recommends that they be dropped:
Consequential system model issues: Three elementary exchanges are found in the compartment “social”. These exchanges can be ignored, both at the unit process and the inventory level, as ecoinvent does not yet account for social impacts.
However, we can just look up the new UUIDs.