A skip prediction system for the music streaming platform Lyra that applies generative retrieval concepts to session modeling. Tracks are encoded as hierarchical semantic IDs via 3-level k-means clustering on audio features, and sessions are then modeled as token sequences in a decoder-only transformer trained with causal language modeling. The transformer is ensembled with LightGBM for the final predictions. The system achieved the best average performance rank: 6th in skip prediction and 3rd in CLV score (the second task, which I didn’t work on).
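To make the semantic ID step concrete, here is a minimal sketch of how 3-level k-means could turn audio feature vectors into hierarchical IDs and then into a token sequence for a session. It is an illustration under stated assumptions, not the project's actual code: it assumes the hierarchy is built by recursively re-clustering the members of each cluster (a residual-quantization variant is another possibility), it uses a branching factor of 4 and random 8-dimensional features as stand-ins, and the names `kmeans`, `semantic_ids`, and `session_tokens` are hypothetical. It sticks to the standard library to stay self-contained; a real pipeline would more likely use a library k-means implementation.

```python
import random


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means; returns one cluster index per point."""
    k = min(k, len(points))
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def semantic_ids(features, k=4, levels=3, seed=0):
    """Map each feature vector to a tuple of `levels` cluster indices.

    Assumption: level n+1 is obtained by re-clustering the members of each
    level-n cluster, so the tuple reads coarse-to-fine.
    """
    rng = random.Random(seed)
    ids = [() for _ in features]

    def split(indices, level):
        if level == levels:
            return
        labels = kmeans([features[i] for i in indices], k, rng)
        groups = {}
        for i, label in zip(indices, labels):
            ids[i] = ids[i] + (label,)
            groups.setdefault(label, []).append(i)
        for members in groups.values():
            split(members, level + 1)

    split(list(range(len(features))), 0)
    return ids


def session_tokens(track_ids, k=4):
    """Flatten a session into tokens, giving each level its own vocabulary range."""
    return [level * k + code
            for tid in track_ids
            for level, code in enumerate(tid)]


# Stand-in data: 200 tracks with 8 made-up audio features each.
data_rng = random.Random(42)
features = [[data_rng.random() for _ in range(8)] for _ in range(200)]
ids = semantic_ids(features)

# A hypothetical session of three track indices becomes nine tokens.
session = [3, 17, 42]
tokens = session_tokens([ids[t] for t in session])
```

Each track contributes three tokens, ordered coarse to fine, so tracks that sound alike share a token prefix. The resulting sequence is the kind of input a decoder-only transformer could be trained on with a next-token objective; how skip labels enter the sequence is not covered by this sketch.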