Virtual Try On for Saree

July 30, 2024
Tushar Saxena

Director of Engineering, Meesho

Sharmila Devi

AI Program Lead, Google Cloud

Meesho, a leading Indian e-commerce platform, is an online marketplace that facilitates trade between suppliers and customers. One of the main challenges its end suppliers face in the cataloging process is the cost and time required to photograph models wearing different apparel products. So Meesho's team partnered with Google Cloud Consulting (GCC) to build a Virtual Try On solution for the most complex garment, the saree! Given a blouse image, a saree body image, and a pallu image provided by the end supplier, the solution generates 2D images and 3D models.

https://storage.googleapis.com/gweb-cloudblog-publish/images/1_-_Intro.max-1700x1700.jpg

Let's take a deeper look at the different components of the tech stack used to build the 2D and 3D-based solution.

https://storage.googleapis.com/gweb-cloudblog-publish/images/2_-_Architecture.max-2200x2200.jpg

The architecture shows the end-to-end pipeline for Meesho's Virtual Try On service on Google Cloud. End distributors upload photos of sarees, which are then processed by a custom AI solution. The processed image is passed to Vertex AI Imagen, which updates the background and upscales the image to enhance its quality. The resulting image can be used to generate product catalogs. The architecture includes the following pipeline stages:

1. Requirements

Below are a few image guidelines that need to be followed:

  1. To build the solution, static images of models wearing a plain white saree without any patterns or reflections are required. It's advisable to have a variety of model poses: front pose, back pose, side pose, etc.

  2. In addition to the model images, the saree images uploaded by the end suppliers should be free of wrinkles and taken in proper lighting conditions. The borders of the saree (if present) should be correctly captured.

2. Saree Reconstruction 

The idea is to convert the input images into a standardized format so that they can be consumed by the downstream saree draping components. We used an overlay and blending method: we first overlay the two input images by a fixed percentage and then use a Python library called "img2texture" to blend the overlapped portion. This produces a seamless join with no visible inconsistencies at the intersection. The method also works well when blending two very different patterns (in our case, the pallu and saree body images can have completely different patterns), and the result still looks seamless and realistic. The following image shows the inputs and outputs of the saree reconstruction module:

https://storage.googleapis.com/gweb-cloudblog-publish/images/3_-_Saree_Reconstruction.max-1000x1000.jpg
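To make the overlay-and-blend idea concrete, here is a minimal sketch. It does not use img2texture and is not the production code; it swaps in a plain linear cross-fade with NumPy, and the function name `overlay_and_blend` and the `overlap_fraction` parameter are hypothetical names chosen for illustration.

```python
import numpy as np

def overlay_and_blend(left, right, overlap_fraction=0.1):
    """Join two equal-height images side by side, cross-fading the overlap.

    left, right: float arrays of shape (H, W, C) with values in [0, 1].
    overlap_fraction: share of the narrower image's width that overlaps
    (an assumed stand-in for the "fixed percentage" in the text).
    """
    if left.shape[0] != right.shape[0]:
        raise ValueError("images must have the same height")
    overlap = max(1, int(min(left.shape[1], right.shape[1]) * overlap_fraction))
    # Weight of the right image ramps linearly from 0 to 1 across the overlap.
    alpha = np.linspace(0.0, 1.0, overlap).reshape(1, overlap, 1)
    seam = (1.0 - alpha) * left[:, -overlap:] + alpha * right[:, :overlap]
    return np.concatenate([left[:, :-overlap], seam, right[:, overlap:]], axis=1)

# Toy inputs: a dark "saree body" strip and a bright "pallu" strip.
body = np.zeros((4, 100, 3))
pallu = np.ones((4, 60, 3))
saree = overlay_and_blend(body, pallu, overlap_fraction=0.1)
```

With real photos, the two inputs would first be resized to a common height; the cross-fade only smooths the transition and does nothing to align pattern details.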

3. 2D Approach 

A. Labeling

We used Label Studio to label the different segments in the saree. Below are the 4 custom segments that were identified in a front facing model: 

  • Pallu: Pallu section of the saree

  • Top Half: Upper portion of the saree, the region that wraps around the upper body.

  • Bottom Half: Lower section of the saree, capturing the area that drapes over the lower body.

  • Blouse: Designated for the blouse section

https://storage.googleapis.com/gweb-cloudblog-publish/images/4_-_2D_Approach_-_Labeling_.max-1500x1500.jpg

B. TPS Warping

Thin-Plate Spline (TPS) warping is a spline-based technique for data interpolation and smoothing that is used to manipulate and deform images or other sets of data points. We used Label Studio to annotate the warping points. Below are the results of this step:

https://storage.googleapis.com/gweb-cloudblog-publish/images/5_-_TPS_Warping.max-1200x1200.jpg
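For readers unfamiliar with TPS, the sketch below fits a 2D thin-plate spline to a handful of control points using the standard kernel U(r) = r² log r plus an affine term. It is a generic textbook formulation written with NumPy, not the code used in this project; `fit_tps` and `apply_tps` are hypothetical helper names, and the control points are made up.

```python
import numpy as np

def _tps_kernel(r):
    # U(r) = r^2 * log(r), with U(0) defined as 0.
    out = np.zeros_like(r)
    nonzero = r > 0
    out[nonzero] = r[nonzero] ** 2 * np.log(r[nonzero])
    return out

def fit_tps(src, dst):
    """Fit a spline that maps each src control point exactly onto dst."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    n = len(src)
    K = _tps_kernel(np.linalg.norm(src[:, None] - src[None, :], axis=-1))
    P = np.hstack([np.ones((n, 1)), src])  # affine part: 1, x, y
    A = np.zeros((n + 3, n + 3))
    A[:n, :n] = K
    A[:n, n:] = P
    A[n:, :n] = P.T
    b = np.zeros((n + 3, 2))
    b[:n] = dst
    return src, np.linalg.solve(A, b)

def apply_tps(model, points):
    """Map arbitrary (x, y) points through a fitted spline."""
    src, params = model
    points = np.asarray(points, dtype=float)
    U = _tps_kernel(np.linalg.norm(points[:, None] - src[None, :], axis=-1))
    P = np.hstack([np.ones((len(points), 1)), points])
    return U @ params[: len(src)] + P @ params[len(src):]

# Four fixed corners plus one centre point that is dragged sideways.
src = np.array([[0, 0], [100, 0], [0, 100], [100, 100], [50, 50]], dtype=float)
dst = src.copy()
dst[4] = [60, 45]
model = fit_tps(src, dst)
warped = apply_tps(model, [[50, 50], [25, 25]])
```

To warp an actual image, one would typically fit the spline from output coordinates back to input coordinates and sample the source pattern at the mapped positions.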

C. Pleat Enhancement

The idea is to make the bottom pleats look realistic by hiding part of the pattern between consecutive pleats. To do this, we first label the entire bottom portion of the saree as pleat masks (PNG masks). Next, we leave some space between the ordered pleats so that a portion of the pattern image can be hidden between consecutive pleats. We then crop the pattern accordingly, which gives the effect that some portions of the saree are tucked between the pleats. Finally, we apply the light mask to the image to get a realistic output. The following image shows the final output from this step:

https://storage.googleapis.com/gweb-cloudblog-publish/images/6_-_Pleats_Enhancement.max-1500x1500.jpg

As we can see in the image above, some portions of the pattern are hidden between pleats, which makes the overall draping look very realistic.
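One simple way to picture the cropping step is as bookkeeping over pattern columns: each pleat gets a visible strip, and a few columns are skipped before the next pleat starts. The sketch below illustrates only that idea under assumed inputs; `pleat_crops`, the pleat widths, and `hidden_width` are all hypothetical, and the real pipeline works from the labeled pleat masks.

```python
def pleat_crops(pleat_widths, hidden_width, start=0):
    """Return (start, end) pattern column ranges, one per pleat, left to right."""
    crops = []
    cursor = start
    for width in pleat_widths:
        crops.append((cursor, cursor + width))
        # Skip the columns that would be folded out of sight before the next pleat.
        cursor += width + hidden_width
    return crops

# Three pleats of decreasing width, with 10 hidden columns between neighbours.
crops = pleat_crops([40, 35, 30], hidden_width=10)
# crops == [(0, 40), (50, 85), (95, 125)]
```

Each range would then be cut from the pattern and placed into the matching pleat mask, so the skipped columns never appear in the output.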

D. Light Masking

Masking is the process through which the pattern image is overlaid on the existing model. In the image below, for a given bottom half segment and a saree pattern, the masking function modifies the model image by multiplying each pixel of the pattern with the base mask to generate the 2D image.

https://storage.googleapis.com/gweb-cloudblog-publish/images/7_-_Light_Masking.max-1300x1300.jpg
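The per-pixel multiplication can be sketched in a few lines of NumPy. This is an illustration under assumptions, not the project's masking function: it assumes the base (light) mask is a grayscale shading map in [0, 1] taken from the plain white saree, that the pattern is already warped to the model image's size, and that `apply_light_mask` and its parameter names are hypothetical.

```python
import numpy as np

def apply_light_mask(model_img, pattern, light_mask, segment_mask):
    """Shade a pattern with a light mask and paste it into one segment.

    model_img:    (H, W, 3) float image of the model, values in [0, 1].
    pattern:      (H, W, 3) float saree pattern, same size as model_img.
    light_mask:   (H, W) float shading map in [0, 1] (folds and shadows).
    segment_mask: (H, W) bool array, True inside the segment being filled.
    """
    # Multiply every pattern pixel by the shading value at that position.
    shaded = pattern * light_mask[..., None]
    out = model_img.copy()
    out[segment_mask] = shaded[segment_mask]
    return out

# Tiny example: the left half of a 2x4 image is the segment to fill.
model_img = np.full((2, 4, 3), 0.5)
pattern = np.ones((2, 4, 3)) * np.array([1.0, 0.5, 0.0])
light_mask = np.full((2, 4), 0.8)
light_mask[0, 0] = 0.4  # a darker fold
segment_mask = np.zeros((2, 4), dtype=bool)
segment_mask[:, :2] = True
result = apply_light_mask(model_img, pattern, light_mask, segment_mask)
```

Because the shading comes from a white garment, multiplying darkens the pattern only where the original drape had folds or shadows, which is what carries the sense of depth into the flat pattern.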

E. Image Composition

The compose image service merges all the segments (Blouse, Top Half, Bottom Half, and Pallu sections) into the final result image.
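A merge like this can be pictured as painting each segment onto the base image in order, using its mask as the alpha channel. The sketch below is only an assumed illustration of that idea; `compose_segments`, the paint order, and the toy masks are hypothetical and do not describe the actual service.

```python
import numpy as np

def compose_segments(base, segments):
    """Paint (image, mask) pairs onto base in order; later segments win.

    base:     (H, W, 3) float image.
    segments: list of (image, mask) where image is (H, W, 3) and mask is
              (H, W) with values in [0, 1] (bool masks also work).
    """
    out = base.astype(float).copy()
    for image, mask in segments:
        alpha = np.asarray(mask, dtype=float)[..., None]
        out = alpha * image + (1.0 - alpha) * out
    return out

# Two overlapping toy segments on a black 2x2 canvas.
base = np.zeros((2, 2, 3))
red = np.ones((2, 2, 3)) * np.array([1.0, 0.0, 0.0])
blue = np.ones((2, 2, 3)) * np.array([0.0, 0.0, 1.0])
left_column = np.array([[True, False], [True, False]])
top_row = np.array([[True, True], [False, False]])
final = compose_segments(base, [(red, left_column), (blue, top_row)])
```

Using fractional mask values near segment borders would soften the seams between neighbouring segments.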

4. 3D Approach 

A. 3D Mesh Rendering 

For the 3D approach, we need 3D files of the model draped in the saree. These files come in various formats; for our experiments, we used the OBJ format.

The 3D experiments were performed in Blender, an easy-to-use 3D modeling and rendering tool. The 3D pipeline uses the .blend file as the source of truth and manipulates the elements within this file to create 3D renders for each of the saree patterns.

https://storage.googleapis.com/gweb-cloudblog-publish/images/8_-_3D_Mesh_Rendering.max-1700x1700.jpg

B. 3D Mesh Scene Photography Service

To capture realistic, showcase-worthy pictures, we need to determine the right camera positions and angles (orientation). We do this manually once and note all the camera positions that we deem fit. We then create a camera_profiles JSON file that specifies the location and orientation of each camera position, with the key set to the name of that particular configuration.

https://storage.googleapis.com/gweb-cloudblog-publish/images/9_-_Mesh_Scene_Photography.max-1300x1300.jpg
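As an illustration of what such a file might look like, the sketch below defines two made-up profiles and a small loader. The profile names, coordinate values, the `rotation_deg` field, and the `load_camera_profile` helper are all assumptions for this example; the post does not show the actual schema.

```python
import json
import math

# Hypothetical camera_profiles content: key = configuration name.
CAMERA_PROFILES_JSON = """
{
  "front_full": {"location": [0.0, -4.0, 1.2], "rotation_deg": [90.0, 0.0, 0.0]},
  "back_full":  {"location": [0.0, 4.0, 1.2],  "rotation_deg": [90.0, 0.0, 180.0]}
}
"""

def load_camera_profile(profiles_json, name):
    """Look up one profile and convert its angles from degrees to radians."""
    profile = json.loads(profiles_json)[name]
    return {
        "location": tuple(profile["location"]),
        # Blender stores Euler rotations in radians.
        "rotation_euler": tuple(math.radians(a) for a in profile["rotation_deg"]),
    }

front = load_camera_profile(CAMERA_PROFILES_JSON, "front_full")

# Inside Blender, the values could then be assigned to a camera object, e.g.:
#   camera.location = front["location"]
#   camera.rotation_euler = front["rotation_euler"]
```

Keeping the positions in a named JSON file means the manual camera setup is done once and then replayed for every saree pattern rendered from the same scene.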

5. GenAI Magic

A. Background Enhancement 

Vertex AI Imagen is used to edit the background of the generated image. Mask-based editing lets you specify a targeted area to apply edits to.

B. Resolution Upscaling 

Vertex AI Imagen lets you increase the size of generated or edited images without losing quality.

https://storage.googleapis.com/gweb-cloudblog-publish/images/10_-_GenAI_Magic.max-1200x1200.jpg

“Draping the product on a model has a huge impact on our users' ability to assess the product, perceive quality, color better and imagine how this would look on them. For our sellers, being able to automate this gives a productivity boost and a jump in likelihood of users buying the product. With this collaboration with Google Cloud, we've been able to marry the best of Google and Meesho tech to create a much better shopping experience. 

This experience is coming to a Meesho app near you soon!” according to Divyesh Shah, Vice President - Engineering at Meesho.

Fast track end-to-end deployment with Google Cloud Consulting (GCC)

The partnership between Google Cloud and Meesho is just one of the latest examples of how we're providing AI-powered solutions that solve complex problems and help organizations achieve their desired outcomes. Meesho entrusted GCC to collaborate with its teams to build state-of-the-art workflows for its business requirements. The GCC portfolio provides a unified services capability, bringing together offerings across multiple specializations in a single place. This includes services ranging from learning and technical account management to professional services and customer success.

Learn more about how Google Cloud Consulting can help you learn, build, operate and succeed.
