Combining Millions of Products Into One Marketplace Using Computer Vision and NLP

  • How do we handle cases where the data provided is insufficient for machine learning?
  • How do we deal with new categories, where we do not yet have enough training data?
  • Should we use separate, simpler models for overall classification and for specific attributes, or one larger monolithic model that handles everything at once?
  • How do we best combine predictions from images and from text, given that some product attributes are better learned from one than the other?
  • How can we monitor model performance over time?
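
To make the image-versus-text question concrete, here is a minimal late-fusion sketch. It is only one possible approach, not necessarily the one used here: it assumes each modality already produces per-class probabilities, and it blends them with a per-attribute weight. The function name `fuse_probabilities`, the `ATTRIBUTE_IMAGE_WEIGHT` table, and all numbers are hypothetical and chosen purely for illustration.

```python
from typing import Dict


def fuse_probabilities(
    image_probs: Dict[str, float],
    text_probs: Dict[str, float],
    image_weight: float,
) -> Dict[str, float]:
    """Blend two class-probability dicts with a weighted average (late fusion)."""
    if not 0.0 <= image_weight <= 1.0:
        raise ValueError("image_weight must be in [0, 1]")
    # A label missing from one modality is treated as probability 0 there.
    labels = set(image_probs) | set(text_probs)
    fused = {
        label: image_weight * image_probs.get(label, 0.0)
        + (1.0 - image_weight) * text_probs.get(label, 0.0)
        for label in labels
    }
    total = sum(fused.values())
    # Renormalize so the result still sums to 1 when the label sets differ.
    return {label: p / total for label, p in fused.items()} if total > 0 else fused


# Hypothetical weights: trust images more for color, text more for material.
ATTRIBUTE_IMAGE_WEIGHT = {"color": 0.8, "material": 0.3}

# Made-up model outputs for a single product.
image_probs = {"red": 0.7, "blue": 0.3}
text_probs = {"red": 0.4, "blue": 0.6}

fused_color = fuse_probabilities(image_probs, text_probs, ATTRIBUTE_IMAGE_WEIGHT["color"])
best_color = max(fused_color, key=fused_color.get)
```

In practice the weights would likely be tuned per attribute on held-out data rather than set by hand, and a learned fusion layer is a common alternative to a fixed weighted average.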

Classification with images and text

Tagging auxiliary attributes



