Platform Architecture · Deep Dive

One Platform. Four Layers. Zero Clusters.

Moorcheh is a four-layer serverless architecture that replaces your entire AI retrieval stack — vector database clusters, reranking APIs, observability middleware, and the engineering overhead to maintain them. Every layer deploys into your VPC as coordinated cloud-native services. Every layer scales independently. Every layer costs $0 when idle.

4 Stack Layers · 1 Unified API · $0 When Idle

Layer 4 · Agentic Memory & Tooling: Agentic Memory · SDKs · MCP
Layer 3 · Search & Reasoning APIs: Retrieval · Generation · State
Layer 2 · Search & Compression Engine: MIB Compression · ITS Scoring
Layer 1 · Sovereign Deployment: Sovereign Cloud · SaaS · Edge
The Architecture Shift

What Disappears When You Switch to Moorcheh

Traditional AI search requires five separate systems running 24/7. Moorcheh replaces all of them with serverless microservices that scale to zero.

Before — The Traditional Stack → After — Moorcheh

1. Vector Database Cluster · ~$591,000/yr · always on
   Qdrant / Pinecone / Weaviate running on memory-optimized instances (r7g.2xlarge) plus Kubernetes orchestration.
   → Serverless Binary Search · ~$18,000/yr · pay per query
   Information-theoretic compression plus Hamming-distance search on AWS Lambda. No cluster. No instances. No Kubernetes.

2. Reranking API · ~$1,500,000/yr · per-query API fees
   Cohere Rerank or similar; an external API call on every query to compensate for approximate search.
   → Built-in Re-ranker · $0 · included
   Native re-ranking pipeline. No external API calls. No per-query fees. No data leaving your VPC.

3. Middleware & Observability · ~$378,000/yr · separate subscription
   LangSmith + LangChain tracing layer, plus token overhead.
   → Built-in Tracing & Monitoring · $0 · included
   Full observability in Mission Control. Log and trace every retrieval path. No middleware subscription.

4. Engineering Team · ~$450,000/yr · 3 FTEs
   Two DevOps and one backend engineer dedicated to cluster maintenance, scaling, and pipeline ops.
   → Fully Managed · $0 · zero DevOps
   SLA-backed managed service. No cluster provisioning. No capacity planning. No 3 AM pages.

Total Annual Cost
Before: $2,500,000+/yr, running 24/7 regardless of usage
After: ~$36,000/yr + license, $0 when idle

98% cost reduction across the full retrieval stack.

From five systems to one. From $2.5M to $36K.

Layer 2 · Core Engine

Why It's Faster, Cheaper, and More Accurate

Three architectural decisions that make the 98% cost reduction possible.

Instant Searchability · O(1) Write Latency

Documents are searchable the instant they land. No indexing queues. No rebuild windows. No stale results while your pipeline catches up.

Traditional vector databases require index rebuilds that can take minutes to hours as your dataset grows. Moorcheh's architecture eliminates the indexing step entirely — every document is available for retrieval the moment it's ingested.

Indexing Delay: 0 ms
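To make the contrast concrete, here is a toy sketch (not Moorcheh's implementation) of what O(1) write latency means in practice: ingestion is a constant-time append, and the document is visible to the very next search with no index-build step in between.

```python
# Conceptual sketch only: an append-only store where a write is a single
# O(1) append and every document is immediately visible to search --
# no index build, no rebuild window, no stale results.

class InstantStore:
    def __init__(self):
        self.docs = []  # append-only list: insert cost does not grow with size

    def ingest(self, doc_id, text):
        self.docs.append((doc_id, text))  # O(1): searchable immediately

    def search(self, term):
        # A linear scan stands in for Moorcheh's compressed binary scan.
        return [doc_id for doc_id, text in self.docs if term in text]

store = InstantStore()
store.ingest("a1", "quarterly payment report")
print(store.search("payment"))  # → ['a1'] with zero indexing delay
```

A traditional ANN index would batch this write into a graph rebuild; the append-only shape is what removes that step entirely.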
Deterministic Retrieval · ITS Scoring Model

Stop getting the wrong document at the worst moment.

Every vector database built on HNSW gives you approximate results — probabilistic nearest neighbors that are fast but not guaranteed to be correct. For legal discovery, financial compliance, and medical records, “probably the right document” is a liability.

Moorcheh's information-theoretic scoring delivers 100% deterministic recall. The exact nearest neighbors. Every time. Not approximate. Not probabilistic. Mathematically guaranteed.

Attribute    | Moorcheh           | Traditional VDBs
Architecture | Deterministic      | Approximate (ANN)
Algorithm    | Exact bitwise scan | HNSW probabilistic graph
Recall       | 100%               | 95–99% (varies)
Compliance   | Audit-safe         | Probabilistic gap
Avg latency  | 9.6 ms             | 37–87 ms
32× Compression — The Serverless Unlock

MIB Binarization · Patent Pending

This is the architectural breakthrough that makes everything else possible.

Moorcheh's information-theoretic binarization compresses float32 embeddings by 32×. A 4KB vector becomes a 128-byte binary code. At this size, vectors load from DynamoDB or S3 into a Lambda function in single-digit milliseconds — searched using Hamming distance, a bitwise CPU operation orders of magnitude faster than cosine similarity.

This is what eliminates always-on RAM clusters. When your vectors are tiny and your search is a native CPU operation, the monolithic database vanishes — replaced by serverless microservices that spin up on demand and scale down to zero.

float32 embedding: 4,096 bytes
MIB binary code: 128 bytes
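A generic sign-binarization sketch (an illustration of the 32× size reduction, not the patent-pending MIB algorithm) shows why the numbers work out: 1,024 float32 values occupy 4,096 bytes, one bit per dimension fits in 128 bytes, and similarity becomes a bitwise XOR-and-popcount.

```python
def binarize(vec):
    """Sign-binarize a float vector: 1 bit per dimension.
    (A generic illustration of 32x compression, not the MIB scheme itself.)"""
    bits = 0
    for i, x in enumerate(vec):
        if x > 0:
            bits |= 1 << i
    return bits.to_bytes((len(vec) + 7) // 8, "little")

def hamming(a, b):
    # XOR + popcount: the bitwise CPU operation that replaces cosine similarity
    return bin(int.from_bytes(a, "little") ^ int.from_bytes(b, "little")).count("1")

dim = 1024                                 # 1,024 float32 values = 4,096 bytes
vec = [(-1) ** i * 0.5 for i in range(dim)]
code = binarize(vec)
print(len(vec) * 4, "->", len(code), "bytes")       # 4096 -> 128 (32x smaller)
print(hamming(code, binarize([-x for x in vec])))   # 1024: every bit differs
```

At 128 bytes per vector, a million-document namespace is ~128 MB of codes, which is what lets a Lambda function load and scan them in milliseconds.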
Ingestion & Scale

Ingest Anything. At Any Scale.

From a single PDF to millions of multimodal documents — drop the file, it's searchable.

Multimodal File Ingestion

PDFs, images, spreadsheets, scanned documents with OCR, Word files, and video (MP4, MOV). Up to 5 GB per file via pre-signed S3 upload.

Every file is processed asynchronously — chunked, embedded, and made searchable without blocking your application. Attach metadata at upload time for filtered retrieval later.

No format conversion. No preprocessing pipeline to build. Drop the file. It's searchable.

PDF · PNG/JPG · XLSX · DOCX · OCR · MP4 · MOV

Built for 10M+ Documents

Moorcheh's namespace-scoped architecture isolates data at the tenant level — each namespace operates independently with its own documents, metadata filters, and access controls.

Ingestion runs in parallel across async workers. Search runs across compressed binary indexes. Both scale horizontally without provisioning decisions.

10M+ documents per deployment. Metadata-filterable at query time. Namespace-level lifecycle management for multi-tenant SaaS.

5 GB per file upload · 10M+ documents per deployment · Async processing pipeline · 100% metadata filterable
Deployment

Your Cloud. Your Rules. 10 Minutes.

Sovereign Cloud in your VPC or managed SaaS — same API, your choice.

Your Cloud

Sovereign VPC Deployment

Deploy Moorcheh's full stack into your own AWS, GCP, or Azure VPC in under 10 minutes. CDK, Terraform, or ARM templates — your choice.

428 coordinated cloud-native assets deploy automatically — all configured, connected, and production-ready:

Lambda Functions · DynamoDB Tables · S3 Buckets · API Gateway · IAM Roles · CloudWatch

No clusters. No Kubernetes. No always-on instances. Everything runs as serverless microservices under your cloud account, your billing, your security perimeter.

Your data never leaves your environment. Full PIPEDA, GDPR, SOC 2, and HIPAA compliance by architecture — not by policy.

428 cloud assets · < 10 min deploy time · Zero clusters · $0 when idle
Moorcheh Cloud

Managed SaaS

Go from zero to production in minutes. Same API, same performance, same SDKs. No cloud account required.

Start building on SaaS. When compliance requirements demand sovereign deployment, migrate to your VPC with zero code changes — same endpoints, same data model, same behavior.

SaaS → Sovereign VPC · Zero code changes
Layer 3 · APIs

Search and Reason. One API.

Two endpoints. Retrieval and generation. Everything else is handled internally.

/search — Precision Retrieval

/search endpoint

Hybrid search in a single syntax. Combine semantic similarity with metadata filters and keyword constraints in one query.

Use #keyword to enforce exact term matching alongside vector search. Filter by any metadata field attached at ingestion. Retrieve raw payloads, scores, and trace data in one pass.

No separate keyword index. No query federation. One endpoint.

Unique to Moorcheh: #keyword constraints give you exact-match precision inside semantic search — the hybrid search that compliance teams actually need.

Hybrid Query — Semantic + Metadata + Keyword
{
  "namespace": "compliance-docs",
  "query": "payment error #status:critical #urgent",
  "filters": { "department": "finance" },
  "topK": 10
}
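Assembled into a client call, the hybrid query above looks like the following; the base URL and auth header name are assumptions, so check your deployment's API docs for the exact values.

```python
import json
import urllib.request

# Hedged sketch of calling /search. BASE_URL and the "x-api-key" header
# name are assumptions, not confirmed values from the Moorcheh docs.
BASE_URL = "https://api.moorcheh.ai/v1"

payload = {
    "namespace": "compliance-docs",
    "query": "payment error #status:critical #urgent",  # #keyword constraints
    "filters": {"department": "finance"},               # metadata filter
    "topK": 10,
}

req = urllib.request.Request(
    f"{BASE_URL}/search",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json", "x-api-key": "YOUR_API_KEY"},
)
# body = urllib.request.urlopen(req).read()  # uncomment with a valid API key
```

One payload carries the semantic query, the exact-match #keyword constraints, and the metadata filter, which is the point: no query federation across separate indexes.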

/answer — RAG in One Call

/answer endpoint

One API call. Full RAG pipeline. Moorcheh handles embedding, retrieval, context injection, and prompt construction internally. You choose the model.

Bring any LLM: Claude 4.5, Llama 4, DeepSeek R1, or your own fine-tuned model. Moorcheh retrieves the relevant context and injects it — you're not locked into any provider.

Claude 4.5Llama 4DeepSeek R1Custom Model
One-Shot RAG
{
  "namespace": "legal-docs",
  "query": "Summarize liability clauses",
  "aiModel": "anthropic.claude-sonnet-4.5"
}
Embed → Retrieve → Inject → Generate (handled internally)
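In client code, the one-shot call reduces to building that payload; a minimal sketch, with field names taken from the example above and everything between retrieval and generation handled server-side.

```python
# Minimal sketch: a one-shot /answer payload. The helper function is
# invented for this example; only the field names come from the docs.
def build_answer_request(namespace: str, query: str, model: str) -> dict:
    return {
        "namespace": namespace,
        "query": query,
        "aiModel": model,  # swap in Claude, Llama, DeepSeek, or a custom model
    }

answer_req = build_answer_request(
    "legal-docs", "Summarize liability clauses", "anthropic.claude-sonnet-4.5"
)
print(answer_req["aiModel"])  # → anthropic.claude-sonnet-4.5
```

Switching providers is a one-string change to `aiModel`; the retrieval and context-injection steps are identical regardless of the model chosen.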
Layer 4 · Developer Tools

Build, Monitor, Ship

Everything you need to build, observe, and ship AI applications.

Mission Control

console.moorcheh.ai

Full observability at console.moorcheh.ai. Monitor latency per endpoint. Trace every retrieval path — which documents were returned, with what scores, through what filters. Prototype and test agents before deploying to production.

Built-in tracing and monitoring means no LangSmith subscription, no LangChain middleware overhead, no separate observability stack. The same dashboard that runs your search also audits it.

Moorcheh Console Dashboard

Replaces: LangSmith + LangChain observability layer (~$378,000/yr saved)


Python & Node.js SDKs

Native SDKs

Fully typed. Async-first. Install from PyPI or npm and start querying in minutes. Every endpoint, every parameter, every response — typed and documented.

Python: $ pip install moorcheh-sdk
Node.js: $ npm install @moorcheh/sdk

Export to Code

Next.js Boilerplate

Design in the Console, export config to our Next.js Boilerplate. Deploy anywhere.

React · Next.js
Layer 4 · Agentic Memory

The Cerebellum for Autonomous Agents

Stop building stateless bots that forget everything between sessions.

Your AI agent reviews a 200-page contract on Monday. On Thursday, a user asks about clause 14.2. Without persistent memory, the agent re-processes the entire document from scratch. With Moorcheh's memory layer, it recalls the full context instantly — what it read, what it concluded, what actions it took.

Moorcheh provides the persistent, stateful memory layer that autonomous agents need to remember, reason, and evolve over time.

Memory is not just storage. It is State.

Short-Term Working Memory

Active context for the current task — what the agent is working on right now, the documents it's retrieved, the reasoning chain in progress.

+

Long-Term Procedural Recall

Persistent knowledge across sessions — past decisions, learned preferences, accumulated expertise. Available instantly on the next interaction.
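As a conceptual illustration only (the class and method names below are invented, not the Moorcheh API), the short-term/long-term split can be pictured as a working list for the current task plus a write-through persistent store that survives between sessions.

```python
import json
import os
import tempfile

# Toy model of the memory split described above. Not the Moorcheh API:
# every name here is invented for illustration.

class AgentMemory:
    def __init__(self, path):
        self.path = path      # long-term store: persisted across sessions
        self.working = []     # short-term store: current task context only
        if os.path.exists(path):
            with open(path) as f:
                self.long_term = json.load(f)
        else:
            self.long_term = {}

    def remember(self, key, fact):
        # Long-term: written through to disk so it survives the session.
        self.long_term[key] = fact
        with open(self.path, "w") as f:
            json.dump(self.long_term, f)

    def recall(self, key):
        # Instant on the next interaction -- no re-processing of the source.
        return self.long_term.get(key)

path = os.path.join(tempfile.mkdtemp(), "memory.json")
monday = AgentMemory(path)
monday.remember("contract:clause-14.2", "limits liability to direct damages")

thursday = AgentMemory(path)   # a new session, days later
print(thursday.recall("contract:clause-14.2"))
```

The Monday/Thursday contract scenario above is exactly this pattern: the conclusion is written once, then recalled by key instead of re-reading 200 pages.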

Plug into what you're already building.

Official connectors for the industry's leading frameworks

Trusted in Production

Built for Engineers. Trusted by Teams.

ShyftLabs
Evalia.ai
drPal.ai
Styrk.ai
RegGenome
99Ravens

Moorcheh cut our retrieval infrastructure cost by over 90%. We went from managing a Qdrant cluster to a single API call.

Engineering Lead

ShyftLabs

The deterministic retrieval was the deciding factor for us. In healthcare, approximate search isn't an option.

CTO

drPal.ai

We deployed a full RAG stack into our VPC in under 10 minutes. Our previous setup took three engineers two months.

Head of AI

Evalia.ai

arXiv peer-reviewed research · Patent-pending protection · SOC 2 compliance ready · 99.9% production uptime SLA
The Engineer's Interrogation

Technical Deep Dive

The questions serious engineers ask before committing to infrastructure


Start building

Start Architecting. Build the next generation of agentic AI with Moorcheh's unified semantic infrastructure.

One API. Full control. Deploy anywhere.