Mapping forest structure across the landscape
What is Biomass?
Biomass can be defined as the standing dry mass of live or dead matter from woody plants, and it is usually expressed as a mass per unit area (e.g. megagrams per hectare, Mg ha⁻¹). It is an essential variable used to monitor the carbon released and sequestered by forest ecosystems because approximately 50% of biomass is carbon. In this article we refer to aboveground biomass, as this is the only portion we can “see” using satellite remote sensing technology.
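As a minimal sketch of the conversion described above, the snippet below turns a biomass density into a carbon stock using the approximate 50% carbon fraction. The function name and the example stand (200 Mg ha⁻¹ over 10 ha) are hypothetical and only illustrate the arithmetic.

```python
# Approximate share of dry biomass that is carbon (the ~50% figure from the text).
CARBON_FRACTION = 0.5


def biomass_to_carbon(agb_mg_per_ha: float, area_ha: float,
                      carbon_fraction: float = CARBON_FRACTION) -> float:
    """Return the carbon stock (Mg C) for a given biomass density and area."""
    if agb_mg_per_ha < 0 or area_ha < 0:
        raise ValueError("biomass density and area must be non-negative")
    return agb_mg_per_ha * area_ha * carbon_fraction


# Hypothetical stand: 200 Mg/ha of aboveground biomass over 10 ha.
carbon_mg = biomass_to_carbon(200.0, 10.0)
print(carbon_mg)  # 1000.0
```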
Measuring Forest Biomass
Measuring the biomass of trees and shrubs is not easy. In fact, the only way to measure biomass directly is through destructive sampling, where trees and shrubs are cut down and weighed. This destructive approach is time-consuming, costly and counterproductive, as we want to keep the carbon in vegetation rather than release it back into the atmosphere (which will likely happen to those cut-down trees over time). However, this method is necessary to develop allometric models. Allometric models are used to estimate biomass from easy-to-measure parameters such as tree diameter and tree height. Allometry is based on biological scaling theory and describes how the body mass, size and shape of living organisms depend on one another. Allometric models are used in traditional forest inventories to calculate tree biomass. Forest inventories are based on the establishment of field plots over the region of interest in order to estimate large-area forest statistics. However, many tropical countries rich in forests do not run forest inventory programs, or are only just starting to implement them, due to the remoteness of the locations and the costs involved.
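To make the idea of an allometric model concrete, here is a minimal sketch assuming a generic power-law form, AGB = a · (ρ · D² · H)^b, where D is stem diameter, H is tree height and ρ is wood density. The coefficients a and b, the inclusion of wood density, and the example trees are illustrative placeholders, not values from this article or from any fitted model. In practice, the coefficients come from destructive sampling campaigns such as those described above.

```python
from dataclasses import dataclass


@dataclass
class Tree:
    diameter_cm: float   # stem diameter
    height_m: float      # total tree height
    wood_density: float  # g cm^-3 (assumed extra input, not named in the text)


def tree_agb_kg(tree: Tree, a: float = 0.05, b: float = 1.0) -> float:
    """Power-law allometry: AGB = a * (rho * D^2 * H)^b.

    a and b are placeholder coefficients chosen for illustration only.
    """
    return a * (tree.wood_density * tree.diameter_cm ** 2 * tree.height_m) ** b


def plot_agb_mg_per_ha(trees: list[Tree], plot_area_ha: float,
                       a: float = 0.05, b: float = 1.0) -> float:
    """Sum tree-level estimates (kg) and express them as a density in Mg/ha."""
    if plot_area_ha <= 0:
        raise ValueError("plot area must be positive")
    total_kg = sum(tree_agb_kg(t, a, b) for t in trees)
    return total_kg / 1000.0 / plot_area_ha


# Hypothetical 0.1 ha inventory plot with two measured trees.
plot = [Tree(30.0, 20.0, 0.6), Tree(30.0, 20.0, 0.6)]
print(round(plot_agb_mg_per_ha(plot, 0.1), 2))
```

Because every tree is passed through the same equation, any bias in the coefficients propagates to every plot-level estimate.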
When measuring forest biomass using traditional forest inventory methods, we are exposed to several sources of error, such as the manual measurement of tree dimensions, the sampling strategy, and the allometric models. Allometric estimates of biomass are usually biased (Demol et al. 2022), with differences of 15% (Burt et al, 2021) or even up to 30% (Calders et al, 2015, Gonzalez de Tanago et al, 2018). Forest inventory plots are measured over long periods of time (e.g. 5-year cycles), using different methodologies, sampling designs, sizes/shapes, and operators, which can lead to large discrepancies when assessing different projects and regions. Additionally, we lack reference data for many of the world’s forests due to inaccessibility and/or very high cost. This makes access to good-quality reference data one of the main challenges when monitoring biomass stocks.
Because of these limitations, we usually avoid the term ‘ground truth data’ and prefer the term ‘reference data’. Reference data is crucial in remote sensing and data science because, for any model you develop, the rule of thumb “garbage in equals garbage out” always applies. This is why at Sylvera we take special care over our reference data.
Our attempt to build the best reference dataset: multi-scale lidar
For this purpose, we visit forests around the world, and laser scan them from the ground and air using our proprietary multi-scale lidar (MSL) methods. We collect 3D data (i.e. point clouds) on the ground using our terrestrial laser scanners (TLS). These scanners can record the structure of individual trees with millimeter-level accuracy, right down to individual twigs and leaves. We also collect similar data from our airborne laser scanners (ALS) mounted on unoccupied aerial vehicles (UAVs), which enables us to collect data over larger areas.
These novel datasets contain large amounts of information on forest structure and aboveground biomass, but accessing this information is complex. Pulling out data on individual trees enables us to carefully reconstruct and model tree-scale parameters such as aboveground biomass. We are able to measure the biomass of trees with a margin of error potentially as low as 3% (Burt et al, 2021) when compared to destructive tree measurements, versus the error of up to 30% mentioned above for allometries. Using this MSL technology we are aiming to build the most accurate reference biomass dataset ever assembled. We are able to scan up to 50,000 ha of forest in one MSL field campaign. MSL reference data can be produced at different spatial resolutions, which enables better upscaling of the data using satellite imagery. Using MSL biomass data we are also able to create our own biomass calibration of spaceborne lidar footprints acquired by the Global Ecosystem Dynamics Investigation (GEDI) sensor, and so enhance our reference dataset.
How do we upscale our biomass measurements to other time periods and over large areas?
Our MSL technology can measure biomass with very high accuracy, but the area we can cover (ca. tens of thousands of hectares per field campaign) and the number of times we can measure it are limited by the time and cost of collecting data. Satellite remote sensing technology is crucial for monitoring biomass stocks because, compared with forest inventories, it allows us to do so more frequently (e.g. annually), across longer periods of time (e.g. 2000 to present), and over larger spatial scales (e.g. regional/national jurisdictions). Current carbon accounting standards rely on satellite imagery to detect activities within carbon offsetting projects (e.g. deforestation, newly planted forest), and combine these with averaged values of biomass or carbon emission factors to determine the amount of carbon stored by forests and the amount released by each activity. These average values are calculated at project level every 5-10 years and are based on manual field plot measurements. Unfortunately, the period between measurements is so long that a large amount of change due to forest disturbances (i.e. emissions) can be missed. More often than not, this type of work suffers from important sampling deficiencies (e.g. too few samples) due to cost, labor intensity and the inaccessibility of some remote areas. Additionally, an average value becomes an increasingly poor descriptor as variance increases, particularly given the sampling deficiencies mentioned above, and most forests we are interested in exhibit a lot of structural variability, which can have a considerable impact on the estimation of biomass stocks and carbon emissions.
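The point about averages and variance can be illustrated with a small sketch, assuming two hypothetical sets of five field plots that share the same mean biomass but differ in variability. The standard error of the mean (sample standard deviation divided by √n) is used here as a simple measure of how well the average is constrained. The plot values are invented for illustration.

```python
import math
import statistics


def mean_with_standard_error(values: list[float]) -> tuple[float, float]:
    """Return the sample mean and its standard error (s / sqrt(n))."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two plots are needed to estimate a standard error")
    mean = statistics.mean(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return mean, standard_error


# Hypothetical plot biomass values in Mg/ha; both sets average 200 Mg/ha.
homogeneous = [190.0, 200.0, 210.0, 195.0, 205.0]
heterogeneous = [60.0, 350.0, 120.0, 290.0, 180.0]

for label, plots in (("homogeneous", homogeneous), ("heterogeneous", heterogeneous)):
    mean, se = mean_with_standard_error(plots)
    print(f"{label}: mean = {mean:.1f} Mg/ha, standard error = {se:.1f} Mg/ha")
```

With the same number of plots, the structurally variable forest yields a much larger standard error, so a single project-level average says far less about any given hectare.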
Forest inventory plots were never designed to be used in combination with pixels from satellite observations. Data manually collected on the ground can differ hugely from remote satellite measurements in terms of spatial resolution and coverage, so discrepancies are usually introduced when trying to generate wall-to-wall remote sensing-derived products. At Sylvera, we train our models using our state-of-the-art MSL-based reference datasets and our in-house calibrated GEDI data with the best publicly available satellite imagery, which allows us to remove or minimize these discrepancies when training our models.
We upscale our biomass estimations over large areas and time scales using long-wavelength synthetic aperture radar (SAR), which can “see” through clouds and has high sensitivity to biomass, and multispectral optical satellite imagery, which is less sensitive to biomass but has a longer temporal coverage and contains other useful information related to the chlorophyll content of vegetation. We also use other types of ancillary information such as digital terrain models and spatial texture analytics.
Forests are very diverse ecological systems that display complex behavior across different temporal and spatial scales. Therefore, non-parametric machine learning algorithms, which make fewer assumptions about the shape and distribution of the reference data, often outperform parametric methods (Evans et al, 2009). Machine learning models can be used to estimate the amount and spatial distribution of biomass and its uncertainty. Using these methods we can also estimate other forest structural parameters such as canopy height or tree cover fraction.
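As an illustration of what "non-parametric" means in practice, the sketch below uses k-nearest-neighbours regression, chosen here only because it is one of the simplest non-parametric methods. It is not a description of the models we actually use. The two features (a rescaled radar backscatter value and a rescaled optical index), the reference biomass values and the choice of k are all hypothetical.

```python
import math
import statistics


def knn_predict(train_features: list[tuple[float, float]],
                train_biomass: list[float],
                query: tuple[float, float],
                k: int = 3) -> tuple[float, float]:
    """Predict biomass for one pixel as the mean of its k nearest reference samples.

    Returns (prediction, spread), where spread is the standard deviation of the
    k neighbours' biomass, used here as a crude indicator of uncertainty.
    """
    if len(train_features) != len(train_biomass):
        raise ValueError("features and biomass must have the same length")
    if not 1 <= k <= len(train_features):
        raise ValueError("k must be between 1 and the number of reference samples")
    # Rank reference samples by Euclidean distance in feature space.
    ranked = sorted(
        (math.dist(features, query), biomass)
        for features, biomass in zip(train_features, train_biomass)
    )
    nearest = [biomass for _, biomass in ranked[:k]]
    return statistics.mean(nearest), statistics.pstdev(nearest)


# Hypothetical reference data: (rescaled SAR backscatter, rescaled optical index).
reference_features = [(0.10, 0.20), (0.15, 0.25), (0.40, 0.55),
                      (0.45, 0.60), (0.80, 0.85), (0.85, 0.90)]
reference_biomass = [20.0, 30.0, 120.0, 140.0, 280.0, 300.0]  # Mg/ha

prediction, spread = knn_predict(reference_features, reference_biomass,
                                 query=(0.42, 0.57), k=2)
print(prediction, spread)
```

No functional form linking the satellite signal to biomass is assumed: the prediction is driven entirely by the reference samples, which is why the quality of the reference data matters so much.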
At Sylvera we use peer-reviewed state-of-the-art approaches for monitoring aboveground biomass (Rodriguez-Veiga et al, 2020, Meyer et al, 2019, Rodriguez-Veiga et al, 2019). We also perform statistically rigorous validations and uncertainty analysis, and follow best practices (Duncanson et al, 2021, McRoberts et al, 2022). Our models are trained regionally to routinely and robustly estimate time series of forest aboveground biomass and carbon stocks from satellite data.
Our aboveground biomass time series maps are used to monitor biomass stock changes over the areas of interest
Our methods are constantly improving through the ongoing acquisition of MSL data to increase our coverage, preparation for upcoming satellite missions (e.g. NISAR and the Biomass mission), and the incorporation of the latest innovations from our own research and the scientific literature. Our methodologies are reviewed internally and externally by leading academics in the field. We have also collaborated with research teams from UCLA, University of Leicester, and University College London.
Why do we need to monitor biomass at Sylvera?
At Sylvera we rate carbon projects belonging to carbon frameworks such as Reducing Emissions from Deforestation and forest Degradation (REDD+). Two of the most important components of these projects are activity data and emission factors, which are then used to calculate emissions. Activity data can be evaluated using land cover classification techniques on satellite imagery, while emission factors can be evaluated using our own biomass measurements. Alternatively, we can compare emissions reported by projects with our own estimates derived from biomass time series data. These biomass time series products provide more insight into where and how much carbon is changing across project areas, and offer an opportunity to detect and evaluate carbon emissions derived from forest degradation (the second “D” in REDD+). When a forest is degraded it still exists, but it has suffered a reduction in its capacity to produce ecosystem services such as carbon storage. This is of key importance because a large proportion of carbon emissions can originate from forest degradation, which in many cases is not reported, and at the same time can be the stepping stone towards a deforestation process.
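The two routes to an emissions estimate described above can be sketched as follows. This is a simplified illustration, not our rating methodology: it assumes the approximate 50% carbon fraction of biomass, converts carbon to CO₂ with the molar mass ratio 44/12, and counts only per-pixel biomass losses in the time-series route (ignoring regrowth and uncertainty). All function names and numbers are hypothetical.

```python
CARBON_FRACTION = 0.5     # approximate carbon share of dry biomass
CO2_PER_C = 44.0 / 12.0   # molar mass of CO2 divided by that of carbon


def emission_factor(agb_mg_per_ha: float) -> float:
    """Emission factor (Mg CO2 per ha) if all aboveground biomass is lost."""
    return agb_mg_per_ha * CARBON_FRACTION * CO2_PER_C


def emissions_from_activity(area_ha: float, factor_mg_co2_per_ha: float) -> float:
    """Route 1: activity data (area affected) multiplied by an emission factor."""
    return area_ha * factor_mg_co2_per_ha


def emissions_from_stock_change(agb_start: list[float], agb_end: list[float],
                                pixel_area_ha: float) -> float:
    """Route 2: sum per-pixel biomass losses between two maps (Mg/ha per pixel)."""
    if len(agb_start) != len(agb_end):
        raise ValueError("both maps must cover the same pixels")
    loss_mg = sum(max(start - end, 0.0)
                  for start, end in zip(agb_start, agb_end)) * pixel_area_ha
    return loss_mg * CARBON_FRACTION * CO2_PER_C


# Route 1: 100 ha deforested in forest averaging 240 Mg/ha.
print(emissions_from_activity(100.0, emission_factor(240.0)))

# Route 2: four 1 ha pixels; one degraded (240 -> 180), one cleared (240 -> 0).
print(emissions_from_stock_change([240.0, 240.0, 240.0, 240.0],
                                  [240.0, 180.0, 0.0, 250.0], 1.0))
```

In the second example the degraded pixel is still forest, so it would not appear in activity data based on deforestation alone, yet it contributes emissions in the stock-change estimate.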
Satellite biomass monitoring will allow us to improve our ratings by evaluating the emissions from deforestation and forest degradation reported by projects, as well as the emissions from forest degradation in projects that did not report it.
About the Research Scientists at Sylvera
The research scientists focusing on biomass at Sylvera are part of two teams: the MSL team, responsible for MSL data acquisition and processing, and the Machine Learning (ML) team, in charge of developing methods for upscaling biomass measurements to project and regional levels using ML technology.
Gabija Bernotaite is an MSL Research Software Engineer, and brings a wealth of experience working with big data and 3D datasets in various capacities, including in the world of automated cars.
Dr Robin Upham is an MSL Research Software Engineer with experience in advanced statistical, probabilistic, and machine learning techniques, focusing his work on lidar processing for forest carbon mapping.
Abhishek Kumar is a Deep Learning Engineer on the ML team. He has more than 4 years of industry experience in building and leading state-of-the-art Deep Learning products, working with different startups bridging the gap between hopes and hard science.
Piotr Pustelnik is an intern on the ML team. He has a Physics background with a scientific computing focus and recently obtained an MSc in Data Science from the University of Bath.
Dr Johannes Hansen is a Remote Sensing Engineer with doctoral and postdoctoral experience in Earth Observation and deforestation mapping with a focus on SAR data.
Dr Miro Demol is an MSL Lidar Scientist investigating the applications of laser scanning in forestry, with a particular interest in aboveground biomass estimation and its uncertainty.
Dr Andrew Burt is a Remote Sensing Scientist and tropical forest ecologist on the MSL team, who over the past decade has helped to pioneer the use of laser scanning in forests.
Dr Pedro Rodríguez-Veiga is a Senior Earth Observation Research Scientist on the ML team with over 12 years of experience in the field of forestry, aboveground biomass retrievals using remote sensing, and forest monitoring.