
Emerging Architectures of LLM Applications 2025
Watch the full webinar, "Building the Future of AI: Emerging Architectures of LLM Applications in 2025"
TensorOps
Mar 3 · 1 min read

Deepseek - Open Source Revolution in AI Models - (AI Lover podcast)
The conversation explores the emergence of DeepSeek as a significant player in the AI landscape, particularly in the context of open-source...
TensorOps
Feb 19 · 1 min read

DeepSeek-V3 Technical Analysis - MoE, Fine-Grained Quantization, DualPipe, MLA
Analysis of Performance and Technical Innovations. Dive into Mixture of Experts (MoE), Fine-Grained Quantization, DualPipe, MLA and more.
Miguel Carreira Neves
Feb 13 · 13 min read

Vector DBs Will Not Save Your RAG
The AI world has collectively fixated on Vector Databases as the holy grail for scalable, accurate information retrieval and synthesis...
Diogo Gonçalves
Jan 6 · 5 min read

Agents Are Just Long-Running Jobs: A Pragmatic View of an Overhyped AI
The Routing Workflow. Source: Anthropic, "Building effective agents". Building an "AI agent" sounds exciting: visions of an autonomous...
Gad Benram
Dec 25, 2024 · 6 min read

Emerging Architectures of LLM Applications (2025 Update)
The world of AI applications is changing rapidly. Not too long ago, most AI systems were simple: a single model received input, made a...
TensorOps
Dec 8, 2024 · 4 min read

Faster Than XGBoost: Using Catboost with C++
Integrating machine learning models into production environments often requires a balance between performance, compatibility, and ease of...
Gad Benram
Dec 5, 2024 · 5 min read

Contextual Retrieval - Enhancing RAG Performance
Traditional RAG systems cannot maintain context in retrieved information. Contextual Retrieval addresses this by enriching data with context.
Miguel Carreira Neves
Nov 7, 2024 · 6 min read

Deploying LLM Proxy on Google Kubernetes Engine: A Step-by-Step Guide
In our previous post, we explored the concept of an LLM Proxy and its importance in scalable LLM application architectures. In this...
Diogo Azevedo
Oct 18, 2024 · 3 min read