Mapping Layer Similarity in LLMs with RSA & CKA
Overview
This project applies Representational Similarity Analysis (RSA) and Centered Kernel Alignment (CKA) to study layer-wise similarity patterns in decoder-only LLMs across parallel English–Hindi prompts. For each layer, the activations are standardized and used to build a Gram matrix, which is then double-centered, flattened, and L2-normalized into a CKA signature. The cosine similarity between two such signatures equals their linear CKA. Stacking the signatures and eigendecomposing their similarity matrix yields a 3D joint projection that makes cross-lingual alignment directly visible.
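The signature construction described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the repository's actual code: the function names `cka_signature` and `linear_cka`, the `(n_prompts, hidden_dim)` activation layout, and the `eps` stabilizer are all assumptions made for the example.

```python
import numpy as np


def cka_signature(acts: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Turn one layer's activations into a unit-norm CKA signature.

    acts: array of shape (n_prompts, hidden_dim) (assumed layout).
    """
    # Standardize each feature across prompts.
    z = (acts - acts.mean(axis=0)) / (acts.std(axis=0) + eps)
    # Linear Gram matrix over prompts, shape (n_prompts, n_prompts).
    gram = z @ z.T
    # Double-center: H K H with H = I - (1/n) 1 1^T.
    n = gram.shape[0]
    h = np.eye(n) - np.ones((n, n)) / n
    centered = h @ gram @ h
    # Flatten and L2-normalize.
    flat = centered.ravel()
    return flat / (np.linalg.norm(flat) + eps)


def linear_cka(x: np.ndarray, y: np.ndarray) -> float:
    """Cosine similarity of two signatures, i.e. linear CKA."""
    return float(cka_signature(x) @ cka_signature(y))
```

Because both signatures are unit-norm, their dot product is the normalized Frobenius inner product of the two centered Gram matrices. Both inputs must cover the same prompts in the same order, although their hidden dimensions may differ.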
Cross-lingual alignment (EN↔HI)
| Model | CKA | 3D distance | Layers |
|---|---|---|---|
| GPT-OSS-20B | 0.698 | 0.407 | 24 |
| Llama-3.2-1B | 0.684 | 0.519 | 16 |
| Llama-3.2-3B | 0.695 | 0.556 | 28 |
| Llama-3.1-8B | 0.653 | 0.603 | 32 |
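A single language axis emerges in the joint projection (corr(PC2, language) ≈ 0.93). GPT-OSS-20B shows the strongest cross-lingual alignment, with the highest CKA (0.698) and the lowest 3D distance (0.407). Among the Llama models, 3D distance grows with model size (0.519 for 1B, 0.556 for 3B, 0.603 for 8B), consistent with increased language-specific drift in deeper layers. CKA does not follow the same monotonic trend: the 3B model scores highest (0.695) and the 8B model lowest (0.653).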
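The joint projection and the language-axis check can be sketched as follows. This is a hedged illustration rather than the project's implementation: the helper names `joint_projection` and `language_axis_corr` are hypothetical, and centering the similarity matrix before the eigendecomposition (as in kernel PCA) is an assumption the README does not state.

```python
import numpy as np


def joint_projection(signatures: np.ndarray, k: int = 3) -> np.ndarray:
    """Project stacked unit-norm signatures onto their top-k eigen-directions.

    signatures: array of shape (n_items, d), one row per (language, layer).
    """
    # Pairwise cosine similarity between signatures (linear CKA).
    sim = signatures @ signatures.T
    # Assumption: center the similarity matrix before eigendecomposition.
    n = sim.shape[0]
    h = np.eye(n) - np.ones((n, n)) / n
    sim_c = h @ sim @ h
    # eigh returns eigenvalues in ascending order; keep the k largest.
    vals, vecs = np.linalg.eigh(sim_c)
    order = np.argsort(vals)[::-1][:k]
    # Scale eigenvectors by sqrt(eigenvalue) to obtain coordinates.
    return vecs[:, order] * np.sqrt(np.clip(vals[order], 0.0, None))


def language_axis_corr(coords: np.ndarray, is_hindi) -> np.ndarray:
    """Absolute Pearson correlation of each component with a 0/1 language label."""
    labels = np.asarray(is_hindi, dtype=float)
    return np.array([
        abs(np.corrcoef(coords[:, j], labels)[0, 1])
        for j in range(coords.shape[1])
    ])
```

The absolute value is taken because the sign of an eigenvector is arbitrary. Which component ends up carrying the language signal depends on the data; the sketch reports the correlation for every component rather than assuming it is PC2.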
Citation
@misc{bompilwar2025mapping,
title = {Mapping Layer Similarity in Large Language Models with RSA \& CKA},
author = {Bompilwar, Ritik},
year = {2025},
howpublished = {\url{https://github.com/RITIK-12/Llama_RSA}}
}