Abstract: Prompt injection represents one of the most fundamental and under-theorized security vulnerabilities in deployed large language model (LLM) systems.…
Abstract: Multilingual language models (MLMs) such as mBERT, XLM-R, and mT5 have demonstrated a remarkable and theoretically underexplained capability: fine-tuning…
Abstract: Constitutional AI (CAI) and Reinforcement Learning from AI Feedback (RLAIF) represent a significant departure from classical RLHF pipelines: rather…
Abstract: Tokenization—the process of segmenting raw text into discrete units for neural processing—is the least-scrutinized component of modern NLP pipelines…
Abstract: Retrieval-Augmented Generation (RAG) and long-context large language models (LLMs) represent two competing paradigms for integrating external knowledge into generative…
Abstract: Knowledge distillation (KD) has emerged as a foundational technique for compressing large neural networks into smaller, deployment-ready student models…