Deep Learning Enables a New View in the Agricultural Industry

ODSC - Open Data Science
May 4, 2022


Although initially a slow adopter of machine learning and computer vision, agriculture has become an important domain for these approaches. Adoption and extension of these approaches are critical due to the challenges facing global agriculture: the world’s population is predicted to reach 9.7 billion by 2050, water supply is expected to fall 40% short of global needs by 2030, and climate change produces significant challenges and uncertainty.

Advances in deep learning and remote sensing technologies have unlocked unprecedented opportunities for precision agriculture. Computer vision is now a key element of agricultural systems to determine crop type, count plants, guide harvesting robots, identify issues like crop stress and weeds, and forecast potential yield. Furthermore, advances in remote sensing have provided unprecedented amounts of data at increasingly high resolution.

In the figure above, we see a cornfield in early June, shortly after emergence, captured at 3 m resolution from satellite (top) and at 10 cm resolution from fixed-wing aircraft (bottom). While both sources give an overview of the field’s development, the high-resolution data that we collect at Intelinair provides views and insights not accessible from lower-resolution imagery. At this resolution, individual rows are identifiable and problem areas can be pinpointed with a precision that would otherwise require manual scouting. Furthermore, multispectral imagery enables the creation of vegetative indices like NDVI (far right), which provide additional insight into where decreased biomass, low vigor, stress, or weeds may be present.
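
As a concrete illustration of how such an index is derived, here is a minimal sketch of computing NDVI from near-infrared and red reflectance bands, assuming the bands have already been read into NumPy arrays (the helper name and any file handling are illustrative, not part of our production pipeline):

```python
import numpy as np

def ndvi(nir: np.ndarray, red: np.ndarray) -> np.ndarray:
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red).

    `nir` and `red` are reflectance bands of the same shape, e.g. read
    from a multispectral GeoTIFF. Values fall in [-1, 1]; dense, healthy
    vegetation pushes toward +1, while bare soil and stressed crops sit lower.
    """
    nir = nir.astype(np.float32)
    red = red.astype(np.float32)
    denom = nir + red
    # Avoid division by zero over no-data or water pixels.
    return np.where(denom == 0, 0.0, (nir - red) / denom)
```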

Satellite and fixed-wing aircraft enable the collection of data at a massive scale. In 2021, we collected more than 500 TB of high-resolution imagery data and supplemented this with additional low-resolution public satellite sources. This massive amount of data makes aerial agricultural imagery a prime candidate for deep learning approaches, which are known to be data-hungry.

However, challenges remain in leveraging common SOTA approaches with remote sensing data, as these approaches are often developed for natural scene imagery like ImageNet. Remote sensing data may be massive, spatiotemporal, and multispectral; it may contain very small objects; and it possesses fundamentally different statistics than natural scene images. Leveraging the right techniques to adapt and extend methods into the remote sensing domain is critical for delivering the best models and insights to our users.

Aerial Agricultural Data: The Same, But Different

At first glance, imagery captured via remote sensing is not much different from a “normal” image you would find on the internet or in a dataset like ImageNet. Both can be represented as matrices and fed to your favorite downstream computer vision model. However, there are additional challenges when working with these images:

Geospatial: To get the data into matrix form, geospatial data requires defining a coordinate system and appropriately projecting the data (i.e., think about turning a globe into a flat map).
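
For illustration, here is a minimal sketch of reprojecting a georeferenced raster into a target coordinate reference system with the open-source rasterio library; the file name and target CRS are placeholders, not our actual data or projection:

```python
import numpy as np
import rasterio
from rasterio.warp import calculate_default_transform, reproject, Resampling

# "field.tif" and the UTM zone below are placeholders for a real field mosaic.
with rasterio.open("field.tif") as src:
    dst_crs = "EPSG:32616"  # e.g. a UTM zone covering the field
    transform, width, height = calculate_default_transform(
        src.crs, dst_crs, src.width, src.height, *src.bounds
    )
    projected = np.zeros((src.count, height, width), dtype=np.float32)
    for b in range(1, src.count + 1):
        reproject(
            source=rasterio.band(src, b),
            destination=projected[b - 1],
            src_transform=src.transform,
            src_crs=src.crs,
            dst_transform=transform,
            dst_crs=dst_crs,
            resampling=Resampling.bilinear,
        )
# `projected` is now an ordinary (bands, rows, cols) array ready for a model.
```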

Massive: The amount of data that exists is massive; it is truly big data. Processing, training on, and running inference over data at this scale requires the right scientific and engineering methodologies to do so efficiently.

Individually Large: With our high-resolution data, each field can be upwards of 15,000 x 15,000 pixels in dimension and over 1GB in size. This requires experimentation and decisions to be made around windowing, downsampling, and multiscale modeling.
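
One common way to handle images of this size is to slide an overlapping window across the field; the sketch below shows the idea (the tile size and overlap are illustrative, not the values we use in production):

```python
import numpy as np

def tile(image: np.ndarray, size: int = 512, overlap: int = 64):
    """Split a (C, H, W) field image into overlapping square windows.

    A 15,000 x 15,000 pixel field will not fit on a GPU in one piece,
    so a window is stepped across it; `overlap` keeps context at the
    seams so per-window predictions can be stitched back together.
    """
    _, h, w = image.shape
    step = size - overlap
    for top in range(0, max(h - overlap, 1), step):
        for left in range(0, max(w - overlap, 1), step):
            bottom = min(top + size, h)
            right = min(left + size, w)
            yield (top, left), image[:, top:bottom, left:right]
```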

Multispectral: Channels beyond RGB provide information that is invisible to the human eye. In traditional ML approaches, additional handcrafted features must be constructed from these channels based on domain knowledge. In deep learning, the model can learn important features directly from the data, but its performance can be impacted by the inductive biases we build into the architecture.
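
For example, one inductive bias that must change immediately is the input stem of a standard backbone, which assumes three channels. A minimal PyTorch sketch, with an illustrative band count:

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

IN_CHANNELS = 5  # e.g. R, G, B, NIR, red edge -- illustrative only

model = resnet18(weights=None)
# Replace the stock 3-channel stem with one matching our band count;
# everything downstream of the first conv is unchanged.
model.conv1 = nn.Conv2d(
    IN_CHANNELS, 64, kernel_size=7, stride=2, padding=3, bias=False
)

x = torch.randn(2, IN_CHANNELS, 224, 224)  # a batch of multispectral tiles
out = model(x)                             # logits from the unchanged head
```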

Multi-sensor: Beyond imagery, other sources of information such as thermal and synthetic aperture radar exist. These must be fused into models appropriately, often accounting for differences in resolution and even time/day of collection between different sources.
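
As a simple illustration of the resolution problem, a coarse sensor can be resampled onto the optical grid before early fusion; the shapes below are hypothetical, and temporal and geolocation alignment is assumed to be handled upstream:

```python
import torch
import torch.nn.functional as F

# Hypothetical example: a coarse thermal band fused with a fine optical tile.
optical = torch.randn(1, 5, 1024, 1024)   # high-resolution multispectral tile
thermal = torch.randn(1, 1, 64, 64)       # much coarser thermal sensor

# Resample the coarse source onto the optical grid, then concatenate
# channel-wise so a single network sees both sources (early fusion).
thermal_up = F.interpolate(
    thermal, size=optical.shape[-2:], mode="bilinear", align_corners=False
)
fused = torch.cat([optical, thermal_up], dim=1)  # (1, 6, 1024, 1024)
```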

Spatiotemporal: Further enlarging the data we push through our models, most geospatial data is spatiotemporal: it is collected many times throughout the year, over many years. And unlike video, the temporal cadence of image collection may be variable. We collect imagery of fields 13 times a growing season (8 shown below) to monitor the crop throughout the season.
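
A sketch of how a season of co-registered tiles might be assembled into a spatiotemporal tensor, with the irregular acquisition dates carried along as an explicit input (dates and shapes are illustrative):

```python
import numpy as np

# A season of co-registered multispectral tiles for one field, collected
# at an irregular cadence; keys are day-of-year, values are (C, H, W) arrays.
flights = {
    152: np.random.rand(5, 256, 256).astype(np.float32),
    171: np.random.rand(5, 256, 256).astype(np.float32),
    186: np.random.rand(5, 256, 256).astype(np.float32),
    214: np.random.rand(5, 256, 256).astype(np.float32),
}

days = sorted(flights)
sequence = np.stack([flights[d] for d in days])              # (T, C, H, W)
# Because the cadence is variable, the elapsed time between frames is
# passed to the model alongside the imagery rather than assumed constant.
deltas = np.diff(days, prepend=days[0]).astype(np.float32)   # (T,)
```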

Color Importance: Color means something more in agriculture than it does in other datasets. The exact channel values correspond to underlying biological processes like photosynthetic activity. This impacts how we think about input normalization as well as layers such as batch and instance norm.
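
One practical consequence is normalizing inputs with fixed, per-band statistics computed once over the whole archive, rather than per image or per batch; the values below are placeholders, not real reflectance statistics:

```python
import torch

# Fixed per-band statistics (placeholder values). Using global stats
# preserves absolute reflectance differences that carry agronomic
# meaning, which per-image or per-batch normalization would erase.
BAND_MEAN = torch.tensor([0.08, 0.10, 0.09, 0.30, 0.22]).view(1, -1, 1, 1)
BAND_STD = torch.tensor([0.03, 0.04, 0.04, 0.10, 0.08]).view(1, -1, 1, 1)

def normalize(batch: torch.Tensor) -> torch.Tensor:
    """batch: (N, C, H, W) reflectance tiles -> standardized model inputs."""
    return (batch - BAND_MEAN) / BAND_STD
```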

(Especially) Hard to Label: Collecting labels for agricultural aerial imagery is particularly onerous. While annotation is a challenge for any machine learning task, remote sensing agricultural data compounds the difficulty. For example, the entities of interest are:

  • very small (often only a few pixels in size),
  • numerous (potentially tens of thousands in a full-field image),
  • crowded (e.g., many weeds merge into a weed cluster), and
  • organic in shape (creating complex boundaries).

Below is an example where weeds, crops, soil, and unfarmed areas are annotated. Such a task requires detailed annotation work. Therefore, leveraging techniques such as weakly supervised methods and active learning is paramount.

Limited Labels: Following from the points above, the difficulty of annotation and the massive amount of data mean that most of the data collected is not annotated (imagine annotating trillions of weeds!). Therefore, we leverage advances in unsupervised and self-supervised techniques based on contrastive learning to make the most of our raw data.
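
As a flavor of the contrastive idea, here is a minimal sketch of a SimCLR-style loss over two augmented views of the same unlabeled tiles; this is a generic formulation for illustration, not our exact training objective:

```python
import torch
import torch.nn.functional as F

def nt_xent(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.1):
    """Contrastive loss between two augmented views of the same tiles.

    z1, z2: (N, D) embeddings of N unlabeled tiles under two different
    augmentations. Each tile's two views are pulled together; every other
    tile in the batch acts as a negative. No annotations are required.
    """
    n = z1.shape[0]
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)        # (2N, D)
    sim = z @ z.t() / temperature                              # (2N, 2N)
    # Mask out self-similarity so a view is never its own negative.
    mask = torch.eye(2 * n, dtype=torch.bool, device=sim.device)
    sim = sim.masked_fill(mask, float("-inf"))
    # The positive for row i is its counterpart view at i + N (or i - N).
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets.to(sim.device))
```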

Tools in our Toolbox

To address some of the challenges discussed above, we use techniques in spatiotemporal modeling, multi-task learning, and contrastive learning. In this session, the following topics will be explored:

  • How a Convolutional LSTM can be used to not only detect but predict areas of nutrient deficiency stress.
  • How incorporating Multi-task Learning can enable fine-grained segmentation.
  • How Contrastive Learning can be used to extract value from our massive amounts of raw data.

The session will also examine the architectural design of the networks used for these different tasks, as well as the exciting results they can deliver to our farmers and agricultural retailers.
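
To give a flavor of the first topic, here is a minimal ConvLSTM cell: an LSTM whose gates are convolutions, so the hidden state keeps its spatial layout as it is unrolled over a season of imagery. The hyperparameters and the unrolling loop below are illustrative, not the production architecture:

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Minimal ConvLSTM cell: LSTM gates computed by a single convolution."""

    def __init__(self, in_ch: int, hid_ch: int, k: int = 3):
        super().__init__()
        self.hid_ch = hid_ch
        # One conv produces all four gates at once from [input, hidden].
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, state):
        h, c = state
        i, f, g, o = torch.chunk(
            self.gates(torch.cat([x, h], dim=1)), 4, dim=1
        )
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        c = f * c + i * torch.tanh(g)
        h = o * torch.tanh(c)
        return h, c

# Unroll over a season of tiles: (T, N, C, H, W) -> final hidden map,
# which a small head could turn into a stress segmentation or forecast.
cell = ConvLSTMCell(in_ch=5, hid_ch=16)
seq = torch.randn(8, 2, 5, 128, 128)
h = torch.zeros(2, 16, 128, 128)
c = torch.zeros_like(h)
for t in range(seq.shape[0]):
    h, c = cell(seq[t], (h, c))
```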

Original post here.

Read more data science articles on OpenDataScience.com, including tutorials and guides from beginner to advanced levels! Subscribe to our weekly newsletter here and receive the latest news every Thursday. You can also get data science training on-demand wherever you are with our Ai+ Training platform. Subscribe to our fast-growing Medium Publication too, the ODSC Journal, and inquire about becoming a writer.
