bw2analyzer
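Package Contents
Classes
ContributionAnalysis
DatabaseHealthCheck
GTManipulator: Manipulate GraphTraversal results.
PageRank
Functions
compare_activities_by_grouped_leaves: Compare activities by the impact of their different inputs, aggregated by the product classification of those inputs.
compare_activities_by_lcia_score: Compare selected activities to see if they are substantially different.
find_differences_in_inputs: Given an Activity, try to see if other activities in the same database (with the same name and reference product) have the same input levels.
print_recursive_calculation: Traverse a supply chain graph, and calculate the LCA scores of each component.
print_recursive_supply_chain: Traverse a supply chain graph, and print the inputs of each component.
traverse_tagged_databases: Traverse a functional unit throughout its foreground database(s) or the listed databases in fg_databases, and group impacts by tag label.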
- class bw2analyzer.ContributionAnalysis[source]#
- annotate(sorted_data, rev_mapping)#
Reverse the mapping from database ids to array indices
- annotated_top_emissions(lca, names=True, **kwargs)#
Get list of most damaging biosphere flows in an LCA, sorted by abs(direct impact).
Returns a list of tuples: (lca score, inventory amount, activity). If names is False, returns the process key as the last element.
- annotated_top_processes(lca, names=True, **kwargs)#
Get list of most damaging processes in an LCA, sorted by abs(direct impact).
Returns a list of tuples: (lca score, supply, activity). If names is False, returns the process key as the last element.
- d3_treemap(matrix, rev_bio, rev_techno, limit=0.025, limit_type='percent')#
Construct treemap input data structure for LCA result. Output like:
{
  "name": "LCA result",
  "children": [{
    "name": process 1,
    "children": [
      {"name": emission 1, "size": score},
      {"name": emission 2, "size": score},
    ],
  }]
}
- get_name(key)#
- hinton_matrix(lca, rows=5, cols=5)#
- sort_array(data, limit=25, limit_type='number', total=None)#
Common sorting function for all top methods. Sorts by highest value first.
Operates in either number or percent mode. In number mode, return limit values. In percent mode, return all values >= (total * limit), where 0 < limit <= 1.
Returns 2-d numpy array of sorted values and row indices, e.g.:
ContributionAnalysis().sort_array((1., 3., 2.))
returns
( (3, 1), (2, 2), (1, 0) )
- Parameters
data – A 1-d array of values to sort.
limit – Number of values to return, or percentage cutoff.
limit_type – Either number or percent.
total – Optional specification of summed data total.
- Returns
2-d numpy array of values and row indices.
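The two limit modes described above can be sketched in plain NumPy. This is a minimal standalone illustration that follows the description in this entry, not the library's implementation; the function name sort_with_indices is hypothetical, and using the plain sum of data as the default total is an assumption.

```python
import numpy as np


def sort_with_indices(data, limit=25, limit_type="number", total=None):
    """Return rows of (value, original index), highest value first."""
    data = np.asarray(data, dtype=float)
    if limit_type not in ("number", "percent"):
        raise ValueError("limit_type must be 'number' or 'percent'")
    if limit_type == "percent" and not 0 < limit <= 1:
        raise ValueError("percent limits must satisfy 0 < limit <= 1")

    order = np.argsort(data)[::-1]               # indices, highest value first
    pairs = np.column_stack((data[order], order))

    if limit_type == "number":
        return pairs[: int(limit)]               # keep the first `limit` rows

    if total is None:
        total = data.sum()                       # assumption: default total is the sum
    return pairs[pairs[:, 0] >= total * limit]   # keep rows above the cutoff


print(sort_with_indices((1.0, 3.0, 2.0)))
print(sort_with_indices((1.0, 3.0, 2.0), limit=0.4, limit_type="percent"))
```

The first call keeps all three values because the default limit of 25 exceeds the array length; the second keeps only values of at least 40% of the total.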
- top_emissions(matrix, **kwargs)#
Return an array of [value, index] biosphere emissions.
- top_matrix(matrix, rows=5, cols=5)#
Find the most important (i.e. highest summed) rows and columns in a matrix, as well as the corresponding non-zero individual elements in those top rows and columns.
Only returns matrix values which are in the top rows and columns. Element values are returned as a tuple:
(row, col, row index in top rows, col index in top cols, value)
Example:
matrix = [ [0, 0, 1, 0], [2, 0, 4, 0], [3, 0, 1, 1], [0, 7, 0, 1], ]
In this matrix, the row sums are (1, 6, 5, 8), and the column sums are (5, 7, 6, 2). Therefore, with rows=2 and cols=2, the top rows are (3, 1) and the top columns are (1, 2). Only non-zero elements are returned, so the result would be:
( ( (3, 1, 0, 0, 7), (1, 2, 1, 1, 4) ), (3, 1), (1, 2) )
- Parameters
matrix – Any Python object that supports the .sum(axis=) syntax.
rows – Number of rows to select.
cols – Number of columns to select.
- Returns
(elements, top rows, top columns)
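The row and column selection described above can be sketched with NumPy as follows. This is an illustrative reimplementation under the stated description, not the library's code; the function name top_rows_and_cols and the small input matrix are made up for the example.

```python
import numpy as np


def top_rows_and_cols(matrix, rows=5, cols=5):
    """Pick the highest-summed rows/columns and list their non-zero elements."""
    matrix = np.asarray(matrix, dtype=float)
    top_rows = np.argsort(matrix.sum(axis=1))[::-1][:rows]
    top_cols = np.argsort(matrix.sum(axis=0))[::-1][:cols]

    elements = []
    for i, row in enumerate(top_rows):
        for j, col in enumerate(top_cols):
            value = matrix[row, col]
            if value != 0:                       # zero entries are skipped
                elements.append((int(row), int(col), i, j, float(value)))
    return elements, top_rows.tolist(), top_cols.tolist()


example = [
    [1, 0, 0],
    [0, 5, 2],
    [0, 3, 9],
]
elements, top_rows, top_cols = top_rows_and_cols(example, rows=2, cols=2)
print(top_rows)  # [2, 1]  (row sums are 1, 7, 12)
print(top_cols)  # [2, 1]  (column sums are 1, 8, 11)
print(elements)
```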
- top_processes(matrix, **kwargs)#
Return an array of [value, index] technosphere processes.
- class bw2analyzer.DatabaseHealthCheck(database)[source]#
- aggregated_processes(cutoff=500)#
- check(graphs_dir=None)#
- make_graphs(graphs_dir=None)#
- multioutput_processes()#
- no_self_production()#
- page_rank()#
- uncertainty_check()#
- unique_exchanges()#
- class bw2analyzer.GTManipulator[source]#
Manipulate GraphTraversal results.
- static add_metadata(nodes, lca)#
Add metadata to nodes, like name, unit, and category.
- static d3_force_directed(nodes, edges, score)#
Reformat to D3 style, which is a list of nodes, and edge ids are node list indices.
- static d3_treemap(nodes, edges, lca, add_biosphere=False)#
Add node data by traversing the graph; assign different metadata to leaf nodes.
- static simplify(nodes, edges, score, limit=0.005)#
Simplify supply chain to include only nodes which individually contribute at least limit * score.
Only removes and combines edges; doesn’t check to make sure amounts add up correctly.
- static simplify_naive(nodes, edges, score, limit=0.0025)#
Naive simplification which removes links below an LCA score cutoff. Orphan nodes are also deleted.
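The naive cutoff idea can be sketched on toy data. The data layout here is an assumption made for illustration only: nodes is a dict keyed by node id, and each edge is a dict with "from", "to", and "impact" keys. The function name drop_small_links is hypothetical and this is not the library's implementation.

```python
def drop_small_links(nodes, edges, score, limit=0.0025):
    """Drop edges whose impact is below limit * score, then drop orphan nodes."""
    threshold = abs(score) * limit
    kept_edges = [edge for edge in edges if abs(edge["impact"]) >= threshold]

    # A node survives only if at least one remaining edge touches it.
    linked = {edge["from"] for edge in kept_edges} | {edge["to"] for edge in kept_edges}
    kept_nodes = {key: value for key, value in nodes.items() if key in linked}
    return kept_nodes, kept_edges


nodes = {"root": {}, "steel": {}, "paint": {}}
edges = [
    {"from": "steel", "to": "root", "impact": 9.0},
    {"from": "paint", "to": "root", "impact": 0.01},
]
kept_nodes, kept_edges = drop_small_links(nodes, edges, score=10.0)
print(sorted(kept_nodes))  # ['root', 'steel']
```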
- static unroll_graph(nodes, edges, score, cutoff=0.005, max_links=2500)#
Unroll a GraphTraversal result, allowing the same activity to appear in the graph multiple times.
- class bw2analyzer.PageRank(database)[source]#
- calculate()#
- page_rank(technosphere, alpha=0.85, max_iter=100, tol=1e-06)#
Return the PageRank of the nodes in the graph.
Adapted from http://networkx.lanl.gov/svn/networkx/trunk/networkx/algorithms/link_analysis/pagerank_alg.py
PageRank computes a ranking of the nodes in the graph G based on the structure of the incoming links. It was originally designed as an algorithm to rank web pages.
The eigenvector calculation uses power iteration with a SciPy sparse matrix representation.
- Parameters
technosphere – The technosphere matrix.
alpha – Damping parameter for PageRank, default=0.85
- Returns
Dictionary of nodes (activity codes) with value as PageRank
References
1. A. Langville and C. Meyer, “A survey of eigenvector methods of web information retrieval.” http://citeseer.ist.psu.edu/713792.html
2. Page, Lawrence; Brin, Sergey; Motwani, Rajeev and Winograd, Terry, “The PageRank citation ranking: Bringing order to the Web.” 1999. http://dbpubs.stanford.edu:8090/pub/showDoc.Fulltext?lang=en&doc=1999-66&format=pdf
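Power iteration for PageRank can be sketched as below. This is a minimal illustration, not the method's code: it uses a dense NumPy array rather than a SciPy sparse matrix for brevity, it assumes adjacency[i, j] is the weight of a link from node i to node j, and the function name pagerank_power_iteration is hypothetical.

```python
import numpy as np


def pagerank_power_iteration(adjacency, alpha=0.85, max_iter=100, tol=1e-6):
    """Return an array of PageRank values that sums to one."""
    adjacency = np.asarray(adjacency, dtype=float)
    n = adjacency.shape[0]
    out_weight = adjacency.sum(axis=1)
    dangling = out_weight == 0                   # nodes without outgoing links

    # Row-normalise so each non-dangling row sums to one.
    transition = np.divide(
        adjacency,
        out_weight[:, None],
        out=np.zeros_like(adjacency),
        where=~dangling[:, None],
    )

    rank = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        previous = rank
        # Follow links with probability alpha; dangling mass is spread evenly.
        rank = alpha * (previous @ transition + previous[dangling].sum() / n)
        rank = rank + (1.0 - alpha) / n          # teleportation term
        rank = rank / rank.sum()
        if np.abs(rank - previous).sum() < n * tol:
            return rank
    raise RuntimeError("power iteration did not converge")


# Nodes 1 and 2 both link to node 0, so node 0 should rank highest.
print(pagerank_power_iteration([[0, 0, 0], [1, 0, 0], [1, 0, 0]]))
```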
- bw2analyzer.compare_activities_by_grouped_leaves(activities, lcia_method, mode='relative', max_level=4, cutoff=0.0075, output_format='list', str_length=50)[source]#
Compare activities by the impact of their different inputs, aggregated by the product classification of those inputs.
- Parameters
activities – list of Activity instances.
lcia_method – tuple. LCIA method to use when traversing supply chain graph.
mode – str. If “relative” (default), results are returned as a fraction of total input. Otherwise, results are absolute impact per input exchange.
max_level – int. Maximum level in supply chain to examine.
cutoff – float. Fraction of total impact at which to cut off supply chain graph traversal.
output_format – str. See below.
str_length – int. If output_format is html, this controls how many characters each column label can have.
- Raises
ValueError – activities is malformed.
- Returns
list: Tuple of (column labels, data)
html: HTML string that will print nicely in Jupyter notebooks.
pandas: a pandas DataFrame.
- Return type
Depends on output_format
- bw2analyzer.compare_activities_by_lcia_score(activities, lcia_method, band=0.1)[source]#
Compare selected activities to see if they are substantially different.
Activities are considered substantially different when their LCIA scores do not all lie within a band of band * max_lcia_score.
Inputs:
activities: List of Activity objects.
lcia_method: Tuple identifying a Method
- Returns
Nothing, but prints to stdout.
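The band test can be sketched on a plain list of scores. This is an illustrative helper only: the name all_within_band is hypothetical, and measuring the band against the score with the largest absolute value is an assumption.

```python
def all_within_band(scores, band=0.1):
    """Return True if the spread of scores is at most band * the largest score."""
    scores = list(scores)
    if not scores:
        raise ValueError("need at least one score")
    largest = max(abs(score) for score in scores)
    spread = max(scores) - min(scores)
    return spread <= band * largest


print(all_within_band([10.0, 9.5, 9.8]))  # True: spread 0.5 is inside 0.1 * 10
print(all_within_band([10.0, 5.0]))       # False: spread 5.0 is outside the band
```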
- bw2analyzer.find_differences_in_inputs(activity, rel_tol=0.0001, abs_tol=1e-09, locations=None, as_dataframe=False)[source]#
Given an Activity, try to see if other activities in the same database (with the same name and reference product) have the same input levels.
Tolerance values are inputs to math.isclose.
If differences are present, a difference dictionary is constructed, with the form:
{Activity instance: [(name of input flow (str), amount)]}
Note that this doesn’t reference a specific exchange, but rather sums all exchanges with the same input reference product.
Assumes that all similar activities produce the same amount of reference product.
(x, y), where x is the number of similar activities, and y is a dictionary of the differences. This dictionary is empty if no differences are found.
- Parameters
activity – Activity. Activity to analyze.
rel_tol – float. Relative tolerance to decide if two inputs are the same. See above.
abs_tol – float. Absolute tolerance to decide if two inputs are the same. See above.
locations – list, optional. Locations to restrict comparison to, if present.
as_dataframe – bool. Return results as pandas DataFrame.
- Returns
dict or pandas.DataFrame.
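How the tolerances feed into math.isclose can be sketched with plain dictionaries of summed input amounts. This is a simplified illustration rather than the function's code; the helper name differing_inputs and the dictionary layout (input name mapped to summed amount) are assumptions.

```python
import math


def differing_inputs(reference, other, rel_tol=1e-4, abs_tol=1e-9):
    """List (input name, amount in `other`) for inputs that are not close."""
    differences = []
    for name in sorted(set(reference) | set(other)):
        expected = reference.get(name, 0.0)      # a missing input counts as zero
        actual = other.get(name, 0.0)
        if not math.isclose(expected, actual, rel_tol=rel_tol, abs_tol=abs_tol):
            differences.append((name, actual))
    return differences


reference = {"steel": 1.0, "electricity": 2.0}
other = {"steel": 1.00001, "electricity": 2.5}
print(differing_inputs(reference, other))  # [('electricity', 2.5)]
```

The steel amounts differ by 1e-5, which is inside the default relative tolerance, so only electricity is reported.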
- bw2analyzer.print_recursive_calculation(activity, lcia_method, amount=1, max_level=3, cutoff=0.01, string_length=130, file_obj=None, tab_character=' ', use_matrix_values=False, _lca_obj=None, _total_score=None, __level=0, __first=True)[source]#
Traverse a supply chain graph, and calculate the LCA scores of each component. Prints the result with the format:
{tab_character * level }{fraction of total score} ({absolute LCA score for this input} | {amount of input}) {input activity}
- Parameters
activity – Activity. The starting point of the supply chain graph.
lcia_method – tuple. LCIA method to use when traversing supply chain graph.
amount – int. Amount of activity to assess.
max_level – int. Maximum depth to traverse.
cutoff – float. Fraction of total score to use as cutoff when deciding whether to traverse deeper.
string_length – int. Maximum length of printed string.
file_obj – File-like object (supports .write), optional. Output will be written to this object if provided.
tab_character – str. Character to use to indicate indentation.
use_matrix_values – bool. Take exchange values from the matrix instead of the exchange instance amount. Useful for Monte Carlo, but can be incorrect if there is more than one exchange from the same pair of nodes.
- Normally internal args:
_lca_obj: LCA. Can give an instance of the LCA class (e.g. when doing regionalized or Monte Carlo LCA).
_total_score: float. Needed if specifying _lca_obj.
- Internal args (used during recursion, do not touch):
__level: int.
__first: bool.
- Returns
Nothing. Prints to sys.stdout or file_obj
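The output line layout given above can be sketched as a small formatter. The helper name format_line and the use of three significant digits are assumptions for illustration; the function's actual number formatting may differ.

```python
def format_line(level, score, total_score, amount, name,
                tab_character="  ", string_length=130):
    """Build one output line: indent, fraction, (score | amount), name."""
    fraction = score / total_score if total_score else 0.0
    line = (
        f"{tab_character * level}{fraction:.3g} "
        f"({score:.3g} | {amount:.3g}) {name}"
    )
    return line[:string_length]                  # truncate to the maximum length


print(format_line(0, 10.0, 10.0, 1.0, "car"))
print(format_line(1, 2.5, 10.0, 0.5, "steel"))
```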
- bw2analyzer.print_recursive_supply_chain(activity, amount=1, max_level=2, cutoff=0, string_length=130, file_obj=None, tab_character=' ', __level=0)[source]#
Traverse a supply chain graph, and prints the inputs of each component.
This function is only for exploration; use bw2calc.GraphTraversal for a better performing function.
The results displayed here can also be incorrect if
- Parameters
activity – Activity. The starting point of the supply chain graph.
amount – int. Supply chain inputs will be scaled to this value.
max_level – int. Max depth to search for.
cutoff – float. Inputs with amounts less than amount * cutoff will not be printed or traversed further.
string_length – int. Maximum length of each line.
file_obj – File-like object (supports .write), optional. Output will be written to this object if provided.
tab_character – str. Character to use to indicate indentation.
__level – int. Current level of the calculation. Only used internally, do not touch.
- Returns
Nothing. Prints to stdout or file_obj
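The traversal idea can be sketched on a toy graph held in a plain dictionary. Everything here is illustrative: the graph layout (node name mapped to a list of (input name, amount per unit) pairs), the function name print_inputs, and the choice to apply the cutoff relative to the starting amount are assumptions, and no Brightway objects are involved.

```python
import io
import sys


def print_inputs(graph, node, amount=1.0, max_level=2, cutoff=0.0,
                 file_obj=None, tab_character="  ", _level=0, _root_amount=None):
    """Recursively write the scaled inputs of `node`, one per line."""
    out = file_obj if file_obj is not None else sys.stdout
    if _root_amount is None:
        _root_amount = amount
    # Stop when too deep, or when the scaled amount falls below the cutoff.
    if _level > max_level or abs(amount) < abs(_root_amount) * cutoff:
        return
    out.write(f"{tab_character * _level}{amount:.3g}: {node}\n")
    for child, per_unit in graph.get(node, ()):
        print_inputs(graph, child, amount * per_unit, max_level, cutoff,
                     out, tab_character, _level + 1, _root_amount)


toy_graph = {
    "car": [("steel", 2.0), ("paint", 0.001)],
    "steel": [("iron ore", 1.5)],
}
buffer = io.StringIO()
print_inputs(toy_graph, "car", cutoff=0.01, file_obj=buffer)
print(buffer.getvalue())
```

With a cutoff of 0.01 the paint input (0.001) is skipped, while steel and its iron ore input are written with increasing indentation.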
- bw2analyzer.traverse_tagged_databases(functional_unit, method, label='tag', default_tag='other', secondary_tags=[], fg_databases=None)[source]#
Traverse a functional unit throughout its foreground database(s) or the listed databases in fg_databases, and group impacts by tag label.
Contribution analysis works by linking impacts to individual activities. However, you also might want to group impacts in other ways. For example, give individual biosphere exchanges their own grouping, or aggregate two activities together.
Consider this example system, where the letters are the tag labels, and the numbers are exchange amounts. The functional unit is one unit of the tree root.
In this supply chain, tags are applied to activities and biosphere exchanges. If a biosphere exchange is not tagged, it inherits the tag of its producing activity. Similarly, links to other databases are assessed with the usual LCA machinery, and the total LCA score is tagged according to its consuming activity. If an activity does not have a tag, a default tag is applied.
We can change our visualization to show the use of the default tags:
And then we can manually calculate the tagged impacts. Normally we would need to know the actual biosphere flows and their respective characterization factors (CF), but in this example we assume that each CF is one. Our result, grouped by tag, would therefore be:
A: \(6 + 27 = 33\)
B: \(30 + 44 = 74\)
C: \(5 + 16 + 48 = 69\)
D: \(14\)
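The grouping arithmetic above can be reproduced with a short sketch. The (tag, impact) pairs are taken directly from the sums listed above; the helper name aggregate_by_tag and the flat list layout are assumptions for illustration, not the function's internal data structure.

```python
from collections import defaultdict


def aggregate_by_tag(contributions, default_tag="other"):
    """Sum impacts per tag; untagged entries (None) get the default tag."""
    totals = defaultdict(float)
    for tag, impact in contributions:
        totals[tag if tag is not None else default_tag] += impact
    return dict(totals)


contributions = [
    ("A", 6), ("A", 27),
    ("B", 30), ("B", 44),
    ("C", 5), ("C", 16), ("C", 48),
    ("D", 14),
]
print(aggregate_by_tag(contributions))
# {'A': 33.0, 'B': 74.0, 'C': 69.0, 'D': 14.0}
```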
This function will only traverse the foreground database, i.e. the database of the functional unit activity. A functional unit can have multiple starting nodes; in this case, all foreground databases are traversed.
Input arguments:
functional_unit: A functional unit dictionary, e.g. {("foo", "bar"): 42}.
method: A method name, e.g. ("foo", "bar")
label: The label of the tag classifier. Default is "tag"
default_tag: The tag classifier to use if none was given. Default is "other"
secondary_tags: List of tuples in the format (secondary_label, secondary_default_tag). Default is empty list.
fg_databases: A list of foreground databases to be traversed, e.g. ['foreground', 'biomass', 'machinery']. It’s not recommended to include all databases of a project in the list to be traversed, especially not ecoinvent itself.
- Returns
Aggregated tags dictionary from aggregate_tagged_graph, and tagged supply chain graph from recurse_tagged_database.