Agentic AI Systems: Planning, Memory, and Tool Use in Autonomous Large Language Model Agents
Abstract: The emergence of large language models (LLMs) capable of reasoning, tool invocation, and multi-step planning has precipitated a new…
Blog and Portfolio
Abstract: Vision-language models (VLMs) have emerged as one of the most consequential developments in modern deep learning, enabling systems to…
Abstract: The transformer architecture has dominated sequence modeling for half a decade, but its $O(n^2)$ attention complexity imposes a hard…
Abstract: Position encoding is a foundational design choice in transformer architectures, enabling models to exploit token order without recurrence. Rotary…
Abstract: Full fine-tuning of large language models (LLMs) has become computationally prohibitive at scales exceeding tens of billions of parameters. …
Abstract: The standard self-attention mechanism in Transformer architectures exhibits quadratic time and memory complexity in sequence length, forming a fundamental…
Abstract: Model quantization, the process of representing neural network weights and activations with reduced numerical precision, has become…
Abstract: Neural Architecture Search (NAS) automates the discovery of high-performing neural network architectures, offering a principled alternative to manual design. …
Abstract: Federated learning (FL) offers a compelling paradigm for training natural language processing models across distributed clients without centralizing raw…