Siddhartha Venkatayogi

My implementation of TIGER, from the 2023 paper Recommender Systems with Generative Retrieval (Rajput et al.), trained and evaluated on Amazon Beauty.

GitHub!

Metric	Mine	Paper
Recall@5	0.0312	0.0454
NDCG@5	0.0210	0.0321
Recall@10	0.0486	0.0648
NDCG@10	0.0265	0.0384

Invalid-ID rate @10 ≈ 0.0006. Best checkpoint at step 20K (val NDCG@10 = 0.0377).
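The invalid-ID rate is not defined on this page. A minimal sketch, assuming it means the fraction of semantic IDs among each user's top-10 beam-search outputs that do not correspond to any catalog item. The names `invalid_id_rate`, `beams`, and `valid_ids` are hypothetical and not taken from the repository.

```python
def invalid_id_rate(beams, valid_ids, k=10):
    """Fraction of top-k generated semantic IDs that match no catalog item.

    beams: one list per user of generated semantic-ID tuples, best first
           (assumed layout, not the repository's actual data structure).
    valid_ids: set of semantic-ID tuples assigned to real items.
    """
    total = 0
    invalid = 0
    for beam in beams:
        for sem_id in beam[:k]:
            total += 1
            if tuple(sem_id) not in valid_ids:
                invalid += 1
    # Guard against an empty evaluation set.
    return invalid / total if total else 0.0


# Toy example: two users, two candidates each, one candidate is not a real item.
valid_ids = {(1, 2, 3, 0), (4, 5, 6, 0)}
beams = [
    [(1, 2, 3, 0), (9, 9, 9, 9)],
    [(4, 5, 6, 0), (1, 2, 3, 0)],
]
rate = invalid_id_rate(beams, valid_ids, k=10)
print(rate)
```

Under this definition, a rate of about 0.0006 would mean that roughly 6 in 10,000 generated candidates fail to map to an item.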

NDCG (Normalized Discounted Cumulative Gain) measures ranking quality, rewarding the correct item appearing higher in the top-K list. Hits are discounted by their rank and normalized so a perfect ranking scores 1.0.
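As an illustration, here is a small sketch of Recall@K and NDCG@K for one user, assuming a leave-one-out setup with a single held-out target item per user. That setup is an assumption here, as are the function names and item IDs. With exactly one relevant item, the ideal DCG is 1, so NDCG reduces to `1 / log2(rank + 1)` for a hit at 1-indexed position `rank`.

```python
import math


def recall_at_k(ranked_items, target, k):
    """1.0 if the held-out target appears in the top-k list, else 0.0."""
    return 1.0 if target in ranked_items[:k] else 0.0


def ndcg_at_k(ranked_items, target, k):
    """NDCG@k when there is a single relevant item.

    The ideal ranking puts the target first, giving IDCG = 1 / log2(2) = 1,
    so the normalized score equals the discounted gain at the hit's rank.
    """
    for rank, item in enumerate(ranked_items[:k], start=1):
        if item == target:
            return 1.0 / math.log2(rank + 1)
    return 0.0  # target not in the top-k list


# Hypothetical predictions for one user; the target sits at rank 2.
ranked = ["item_7", "item_3", "item_9"]
recall = recall_at_k(ranked, "item_3", k=5)
ndcg = ndcg_at_k(ranked, "item_3", k=5)
print(recall, ndcg)
```

The reported metrics would then be the mean of these per-user scores over the test set.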

Future Work

Tune hyperparameters and training behavior to match or exceed the metrics reported in the original paper. Test metrics currently land at roughly 65 to 75% of the paper's values (about 70% on average). I suspect the gap comes from suboptimal RQ-VAE training.
Add implementations for PLUM and STATIC
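The size of the gap to the paper can be checked directly against the results table. The snippet below is only arithmetic on the numbers reported above; the variable names are illustrative.

```python
# Test metrics copied from the results table: (mine, paper).
results = {
    "Recall@5": (0.0312, 0.0454),
    "NDCG@5": (0.0210, 0.0321),
    "Recall@10": (0.0486, 0.0648),
    "NDCG@10": (0.0265, 0.0384),
}

# Ratio of this implementation's score to the paper's, per metric.
ratios = {name: mine / paper for name, (mine, paper) in results.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1%} of paper")

mean_ratio = sum(ratios.values()) / len(ratios)
print(f"mean: {mean_ratio:.1%}")
```

The ratios fall between about 65% (NDCG@5) and 75% (Recall@10), with a mean just under 70%.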

References

Recommender Systems with Generative Retrieval. Rajput et al., NeurIPS 2023.
Autoregressive Image Generation using Residual Quantization (RQ-VAE). Lee et al., CVPR 2022.