Originally designed for military use and scientific research, satellites are now increasingly coveted by Silicon Valley. But for what purpose?
Originally designed for military use and scientific research, satellites have become essential to the functioning of the information economy, and their number around our planet keeps growing. The Union of Concerned Scientists estimated that more than 2,000 artificial satellites were in service in orbit in 2019, including 700 launched since 2017.
That figure should increase with the arrival in the space sector of the GAFAs and of Silicon Valley companies, such as Elon Musk's SpaceX or Jeff Bezos's Blue Origin, which are developing reusable launchers to reduce the unit launch cost and to launch all types of satellites, not only observation satellites. Miniaturization (visible on Earth in smartphones, for example) has made it possible to build nanosatellites, which further reduces the share of the launch in the cost price of a satellite. These satellites can collect several terabytes of data, which are then transmitted to ground stations and used to study various phenomena.
What data do these observation satellites collect? How and by whom are these data used? Far from being limited to the military or governmental realms, data from Earth observation satellites are now increasingly sought out and exploited by private players in agriculture, insurance, finance and even the oil industry, thanks to the considerable technological progress made in the aerospace sector and in data processing.
Technological advances, such as the development of launchers, the miniaturization of satellites and the improvement of optical techniques, are at the origin of the proliferation of satellites.
Satellite observation data can take different forms, the main ones being: firstly, satellite imagery (in the form of photographs), which is mainly used to map areas; and secondly, data from radar satellites, which, by illuminating the Earth’s surface with an electromagnetic signal that is reflected and measured, are able to record a multitude of data (object detection, movement, altimetry, etc.) day and night, whatever the weather conditions.
The growing interest in satellite observation data is reinforced by the miniaturization of satellites, which makes this technology more accessible: the cost (construction and launch) has fallen from several tens, even hundreds, of millions of euros for a standard satellite to a few hundred thousand euros for a nanosatellite.
This cost is explained in particular by the high launch cost, which is due in part to the mass of the satellite, itself linked to the size and weight of the measuring instruments, the type of engine used and its energy source (batteries, solar panels, etc.). A lighter satellite can be put into orbit by a cheaper launcher, or by ride-sharing with other satellites, which further reduces this cost. Satellite manufacturers (Airbus, Maxar, Thales Alenia Space, ArianeGroup, etc.) have relied on technological advances in engines, as well as on the development of electric propulsion, which is more effective at reducing total mass. At the same time, advances in electronics have made it possible to manufacture increasingly energy-efficient chips, further reducing the weight of satellites.
All these advances have led to the construction of nanosatellites, designated as such when they weigh less than 10 kg (a so-called “mini” satellite weighs between 100 and 500 kg). Their reduced size makes it possible to launch a larger number of them with a single launcher while collecting as much data, all at a lower cost. This is particularly true for imaging radars, although SAR (Synthetic Aperture Radar) technology, which can capture two- and three-dimensional images, is still in its infancy.
Finally, spatial resolution, which refers to the size of the smallest detail that can be distinguished in images captured by a satellite (in practice, the ground distance covered by one pixel), was around 100 meters just a few years ago, whereas today it has dropped to less than one meter. This progress can be explained by the improved quality of the cameras installed on the satellites. It makes it possible, for example, to distinguish precise details on a house, giving rise to many applications, particularly in the field of insurance (see below). However, too high a resolution can be problematic in terms of data ownership (privacy, image rights, etc.) and, moreover, is not always necessary to exploit these images. This is why demand today is mainly concentrated around a resolution of about thirty centimeters.
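To make the effect of resolution concrete, here is a minimal arithmetic sketch. The 10-meter house width is a hypothetical value chosen for illustration, and the function name is ours; the resolutions are the three orders of magnitude mentioned in the text.

```python
import math


def pixels_across(object_width_m: float, resolution_m: float) -> int:
    """Whole pixels spanning an object, given the ground size of one pixel."""
    if resolution_m <= 0:
        raise ValueError("resolution must be positive")
    return math.floor(object_width_m / resolution_m)


# Hypothetical house width, for illustration only.
HOUSE_WIDTH_M = 10.0

for resolution_m in (100.0, 1.0, 0.3):
    across = pixels_across(HOUSE_WIDTH_M, resolution_m)
    # A square house covers roughly across * across pixels.
    print(f"{resolution_m} m/pixel: {across} pixels across, {across * across} in area")
```

At 100 meters per pixel the house is smaller than a single pixel and cannot be distinguished; at 30 centimeters it spans a few dozen pixels in each direction, which is enough to look at a roof.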
These various technological advances have made it possible to collect a larger volume of satellite data of better quality. But exploiting these data first requires processing them and making them usable by companies, which is made possible by the development of the cloud and of Data Science applied to the space domain.
First of all, the rise of the cloud has facilitated the large-scale storage of the huge volumes of data collected by satellites. Added to this are advances in data processing using Artificial Intelligence, which have made it possible to automate data cleansing and reduce the time spent interpreting data.
As an illustration, in 2019 Amazon Web Services launched AWS Ground Station, a network of ground stations located around the world to control satellites, receive their data, store it in its cloud system and analyze it. The technology giant has formed partnerships with satellite operators, such as Maxar Technologies and Thales Alenia Space, to give them access to its services, facilitating data access for their customers. These are the first “Ground Stations-as-a-Service”: a service model giving customers on-demand access to these satellites and to a set of data analysis functionalities on a pay-per-use basis, without having to set up heavy infrastructure, the main obstacle to the widespread use of satellite data.
The computing power of the cloud then makes it possible to cross-reference multiple data sources (aerial images, radar images, infrared, ultraviolet, etc.) in order to characterize a given phenomenon from several angles, which considerably enriches Machine Learning models. For example, a forest can be mapped using satellite imagery, which reveals the surface area and type of vegetation, to which radar imagery can be added to capture depth, providing a more complete and accurate view of the area.
• The mining and petroleum sectors make extensive use of satellite data for the detection and identification of oil slicks. For example, the location of a new drilling area can be studied using satellite imagery and radar (SAR). This technology makes it possible to “see” under the ground and thus to judge the suitability of the soil for mining activity, which is impossible with simple satellite imagery. However, SAR technologies are very expensive and therefore difficult to deploy on a large scale.
• It is also possible to measure fuel inventories in a given area using geospatial data. This is precisely what the American startup Orbital Insight did in 2016 using satellite images of oil storage tanks in China, whose fill levels it calculated from the shadows cast on the ground (see image below). This led to an estimate of 600 million barrels of oil reserves stored in China at the time, 50% more than the estimates made by market experts. Such an analysis could also be provided to an energy producer wishing to estimate the activity of its competitors, or to traders speculating on the price of oil, as Kayrros does.
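The shadow-based estimate can be sketched with simple geometry. This is an illustrative simplification, not Orbital Insight's actual method: it assumes a cylindrical tank with a floating roof that rests on the oil, a known sun elevation angle, and two shadow lengths already measured on the image (the shadow cast by the tank on the ground, and the shadow cast by the rim onto the roof inside the tank). All function names and numeric values are hypothetical.

```python
import math

# 1 cubic meter is about 6.2898 oil barrels (1 barrel = 158.987 liters).
BARRELS_PER_M3 = 6.2898


def fill_fraction(interior_shadow_m: float, exterior_shadow_m: float) -> float:
    """Share of the tank that is full.

    Both shadows are cast under the same sun angle, so their ratio equals
    (empty height) / (total height), whatever that angle is.
    """
    if exterior_shadow_m <= 0:
        raise ValueError("exterior shadow must be positive")
    empty_share = interior_shadow_m / exterior_shadow_m
    # Clamp to [0, 1] to absorb measurement noise.
    return min(1.0, max(0.0, 1.0 - empty_share))


def tank_height_m(exterior_shadow_m: float, sun_elevation_deg: float) -> float:
    """Tank height recovered from its ground shadow and the sun elevation."""
    return exterior_shadow_m * math.tan(math.radians(sun_elevation_deg))


def stored_volume_m3(diameter_m: float, interior_shadow_m: float,
                     exterior_shadow_m: float, sun_elevation_deg: float) -> float:
    """Estimated oil volume: cylinder capacity times fill fraction."""
    height = tank_height_m(exterior_shadow_m, sun_elevation_deg)
    capacity = math.pi * (diameter_m / 2) ** 2 * height
    return capacity * fill_fraction(interior_shadow_m, exterior_shadow_m)


# Hypothetical measurements for one tank.
volume = stored_volume_m3(diameter_m=80.0, interior_shadow_m=5.0,
                          exterior_shadow_m=20.0, sun_elevation_deg=45.0)
print(f"{volume:.0f} m3, about {volume * BARRELS_PER_M3:.0f} barrels")
```

With an interior shadow of 5 m and an exterior shadow of 20 m, the tank is estimated to be three-quarters full. Summing such estimates over every tank detected in a region would give a regional inventory, under the same assumptions.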
Trading companies and hedge funds, for their part, find in satellite data a way to pick up weak signals on the economic performance of companies well before those companies report to the financial markets. A case in point is the use of satellite images of the parking lots of large retail chains, in which parked cars are detected and then counted by a Computer Vision model.
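The counting step can be illustrated with a toy sketch. A real pipeline would rely on a trained detection model; here we assume, for illustration only, that an upstream model has already produced a binary mask in which 1 marks a "car" pixel, and we simply count the connected groups of such pixels. The mask and the function name are hypothetical.

```python
from collections import deque


def count_blobs(mask: list[list[int]]) -> int:
    """Count 4-connected groups of non-zero cells in a 2D grid."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # New blob found: flood-fill it so it is counted only once.
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


# Hypothetical mask of a small parking lot (1 = pixel classified as "car").
PARKING_LOT = [
    [1, 1, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

print(count_blobs(PARKING_LOT), "cars detected")
```

Tracking this count for the same parking lot over successive images is what would turn it into a signal on store traffic. Cars parked so close together that their pixels touch would merge into one blob, which is one reason real systems use object detectors instead.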
Finally, another area of interest in Earth observation is insurance, where satellite data is being used to automate tasks such as property inspection and fraud detection. For example, the American startup Cape Analytics has a database of satellite images of 70 million homes, updated several times a year, to which it applies Computer Vision models to provide analyses for insurance companies: detecting undeclared additions made to a property, or inspecting the condition of a roof, tasks that until now have required human intervention. Insurers' risk prediction models are thus enriched with accurate data, acquired more quickly and at a lower cost.
This ability to translate raw satellite data into concrete, actionable results at a reasonable cost has expanded the scope of geospatial data applications. This is why pure-play data companies, aware of the major opportunity that Earth observation represents, have for some years been positioning themselves on the processing and analysis of satellite data to provide decision support tools.
The potential applications of geospatial data are ultimately very broad, and they will be extended by technologies such as UAVs or the Internet of Things (IoT), which allow new data to be collected and cross-referenced with satellite data. Above all, the emergence of tech giants in the space industry (SpaceTech), building reusable rockets capable of launching satellites into orbit and returning to Earth, will make it possible to send more satellites into orbit at a decreasing marginal cost, paving the way for other future applications. Added to this is the continuing exponential drop in the cost of data and of its processing, which is accelerating the development of Earth observation through ever more varied use cases for companies.
However, the multiplication of satellites will probably raise environmental and space congestion issues that will have to be taken into account by the players concerned…
Sarah GAUVARD and Maxime CARO