
Self-Hosting Large Language Models in China: Limitations and Possibilities
While cloud vendors are ramping up their supply of AI models and GPU hardware in China, companies that are required to self-host these...
Bernardo Pissarra
5 days ago · 21 min read

Emerging Architectures of LLM Applications 2025
Watch the full webinar, "Building the Future of AI: Emerging Architectures of LLM Applications in 2025."
TensorOps
Mar 3 · 1 min read

DeepSeek - Open Source Revolution in AI Models - (AI Lover podcast)
The conversation explores the emergence of DeepSeek as a significant player in the AI landscape, particularly in the context of open-source
TensorOps
Feb 19 · 1 min read

DeepSeek-V3 Technical Analysis - MoE, Fine-Grained Quantization, DualPipe, MLA
Analysis of Performance and Technical Innovations. Dive into Mixture of Experts (MoE), Fine-Grained Quantization, DualPipe, MLA and more.
Miguel Carreira Neves
Feb 13 · 13 min read

Vector DBs Will Not Save Your RAG
The AI world has collectively fixated on Vector Databases as the holy grail for scalable, accurate information retrieval and synthesis to so
Diogo Gonçalves
Jan 6 · 5 min read

Agents Are Just Long-Running Jobs: A Pragmatic View of an Overhyped AI
Building an “AI agent” sounds exciting: visions of an autonomous...
Gad Benram
Dec 25, 2024 · 6 min read

Emerging Architectures of LLM Applications (2025 Update)
The world of AI applications is changing rapidly. Not too long ago, most AI systems were simple: a single model received input, made a...
TensorOps
Dec 8, 2024 · 4 min read

Faster Than XGBoost: Using CatBoost with C++
Integrating machine learning models into production environments often requires a balance between performance, compatibility, and ease of...
Gad Benram
Dec 5, 2024 · 5 min read

Contextual Retrieval - Enhancing RAG Performance
Traditional RAG systems cannot maintain context in retrieved information. Contextual Retrieval addresses this by enriching data with context
Miguel Carreira Neves
Nov 7, 2024 · 6 min read

Deploying LLM Proxy on Google Kubernetes Engine: A Step-by-Step Guide
In our previous post, we explored the concept of an LLM Proxy and its importance in scalable LLM application architectures. In this...
Diogo Azevedo
Oct 18, 2024 · 3 min read


Cohort-Based Forecasting: A Technical Deep Dive
At TensorOps, we specialize in implementing AI solutions that drive business growth. One powerful application of AI in the business...
Higor Ribeiro de Oliveira
Oct 17, 2024 · 4 min read

Building AI and LLM Agents from the Ground Up: A Step-by-Step Guide
OpenAI’s vision of creating artificial general intelligence (AGI) might still be futuristic, but today’s AI agents are already making a sign
Clara Gadelho
Oct 14, 2024 · 9 min read

Comparing Context Caching in LLMs: OpenAI vs. Anthropic vs. Google Gemini
Compare context caching in LLMs—OpenAI, Anthropic, Google Gemini. Discover the best option for your project's cost, ease, and features.
Bruno Alho
Oct 14, 2024 · 4 min read

10 Essential AI Technologies for Software Supply Chain Companies
Table of Contents: Introduction; The Software Supply Chain; AI in Software Development: The Rise of Code Assistants...
Gad Benram
Oct 13, 2024 · 5 min read

Knowledge Graph RAG vs. Vector DB RAG: Is It Time for GraphDBs to Shine?
The emergence of AI has revolutionized the way we interact with data—or even knowledge itself. Among the buzzwords circulating in the...
Gad Benram
Oct 13, 2024 · 6 min read

Moving from Chatbots to Agents
While the terms “Chatbot” and “AI agent” are sometimes used interchangeably, there are notable differences between them:
Diogo Gonçalves
Oct 8, 2024 · 6 min read

Prompt Translation: The Way to Switch Between LLMs Without Losing Performance
Since the debut of ChatGPT in late 2022, the landscape of Large Language Models (LLMs) has evolved dramatically. Back then, the primary...
Gad Benram
Sep 24, 2024 · 5 min read

UX in LLM Applications: Examples of 4 Companies Getting It Right and 1 That Missed the Mark
Over the past year, TensorOps has observed a recurring scenario: organizations invest significant time—often 5-7 months—fine-tuning...
Gad Benram
Sep 20, 2024 · 4 min read

OpenAI Unveils o1 Model: The Biggest Leap Towards AGI since ChatGPT
September 12, 2024: OpenAI has unveiled its latest breakthrough in artificial intelligence, the o1 model series, now available in Preview....
Gad Benram
Sep 12, 2024 · 3 min read

What can shift Nvidia's stock up or down?
NVIDIA's stock soared 2750% due to $10B in Q2 data center sales. Buyers like Google and Meta aim to leverage AI tech, but is it a bubble?
Gad Benram
Aug 31, 2024 · 5 min read