Multimodal Data Integration for Sustainable Indoor Gardening: Tracking Anyplant with Time Series Foundation Model

This paper introduces a multimodal framework that integrates computer vision, environmental sensing, and the Lag-Llama time-series foundation model for automated plant health monitoring in sustainable indoor gardening.

Key Findings

  • Integrating RGB imagery, phenotypic ratios (e.g., area-to-height), and environmental data (temperature, humidity, VOC) improved water-stress prediction accuracy for basil plants grown in a Vivosun Smart Grow Tent.
  • Zero-shot Lag-Llama outperformed the fine-tuned variant, achieving the lowest errors with the full multimodal feature set: CRPS 0.00109, MSE 0.009680, MAE 0.096282.
  • The Tracking Anyplant model (built on SAM and XMem) effectively extracted features such as RGB values and plant dimensions from webcam images, enabling scalable controlled-environment agriculture (CEA) applications.
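The paper reports CRPS, MSE, and MAE as forecast-quality metrics. As a rough illustration of how such numbers can be computed (not necessarily the authors' exact pipeline), the sketch below uses a standard sample-based CRPS estimator, E|X − y| − ½·E|X − X′|, on synthetic forecast samples; the soil-moisture-like trace and noise level are invented for the example.

```python
import numpy as np

def crps_from_samples(samples: np.ndarray, y: np.ndarray) -> float:
    """Sample-based CRPS estimate: E|X - y| - 0.5 * E|X - X'|."""
    # samples: (n_samples, horizon); y: (horizon,)
    term1 = np.mean(np.abs(samples - y))
    term2 = 0.5 * np.mean(np.abs(samples[:, None, :] - samples[None, :, :]))
    return float(term1 - term2)

def point_errors(forecast: np.ndarray, y: np.ndarray) -> tuple[float, float]:
    """MSE and MAE of a point forecast against the ground truth."""
    mse = float(np.mean((forecast - y) ** 2))
    mae = float(np.mean(np.abs(forecast - y)))
    return mse, mae

# Hypothetical example: a smooth "true" trace and noisy probabilistic samples.
rng = np.random.default_rng(0)
y = np.sin(np.linspace(0, 3, 24))
samples = y + rng.normal(0, 0.1, size=(100, 24))

print("CRPS:", crps_from_samples(samples, y))
print("MSE, MAE:", point_errors(samples.mean(axis=0), y))
```

Averaging the samples gives the point forecast scored by MSE/MAE, while CRPS scores the full predictive distribution, which is why the paper reports both kinds of error.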

Future Directions

  • Expand to diverse plant species and varied environments, and refine fine-tuning to handle minute-level temporal mismatches between data streams.
  • Integrate with Building Management Systems (BMS) for energy-efficient urban agriculture and automated irrigation.
  • Pre-train Lag-Llama on high-granularity datasets to surpass zero-shot performance; add sensors for broader phenotyping.