privacy-filter

OpenAI Privacy Filter is an Apache‑2.0 token‑classification model with 1.5B total parameters (50M active) that tags and redacts eight categories of PII across contexts of up to 128k tokens in a single pass, enabling local‑first privacy for logs, documents, and chat data.

How to Get Started

Step 1

Step 2

Paste the URL of the Hugging Face repository: https://huggingface.co/openai/privacy-filter

Step 3

Choose the GPU machine to deploy.

That’s all! You’re ready to use the model in a few minutes, without infrastructure complexity.

How can this model be used?

You can use privacy‑filter as a preprocessing layer that scans raw text and masks, drops, or pseudonymizes sensitive spans before they reach downstream LLMs, storage, or analytics systems. It is designed to plug into ingestion pipelines, RAG indexing jobs, logging infrastructure, and enterprise workflows where engineering and compliance teams need tunable recall/precision and full control over how detected PII is handled.
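As a rough illustration of the mask / drop / pseudonymize choice, here is a minimal Python sketch of the step that runs after detection. It assumes the model's predictions have already been converted into character-offset spans; the span dictionary shape, the label names (`NAME`, `EMAIL`), the `redact` helper, and the demo key are all hypothetical and not part of the model's documented output.

```python
import hashlib
import hmac

# Hypothetical secret used to derive stable pseudonyms; load it from a
# secret store in a real deployment.
PSEUDONYM_KEY = b"demo-key"


def redact(text, spans, strategy="mask"):
    """Apply one handling strategy to every detected span.

    spans: list of dicts with "start"/"end" character offsets and a "label".
    This shape is an assumption for the sketch, not the model's output format.
    """
    pieces = []
    cursor = 0
    for span in sorted(spans, key=lambda s: s["start"]):
        if span["start"] < cursor:
            continue  # skip spans that overlap one already handled
        pieces.append(text[cursor:span["start"]])
        original = text[span["start"]:span["end"]]
        if strategy == "mask":
            # Replace the span with its category name.
            pieces.append(f"[{span['label']}]")
        elif strategy == "pseudonymize":
            # Keyed hash: the same value always maps to the same token.
            digest = hmac.new(PSEUDONYM_KEY, original.encode("utf-8"),
                              hashlib.sha256).hexdigest()[:8]
            pieces.append(f"<{span['label']}_{digest}>")
        elif strategy != "drop":
            raise ValueError(f"unknown strategy: {strategy}")
        # "drop" appends nothing, so the span simply disappears.
        cursor = span["end"]
    pieces.append(text[cursor:])
    return "".join(pieces)


text = "Contact Ada at ada@example.com today."
spans = [
    {"start": 8, "end": 11, "label": "NAME"},
    {"start": 15, "end": 30, "label": "EMAIL"},
]
print(redact(text, spans, "mask"))  # Contact [NAME] at [EMAIL] today.
```

Masking keeps the category visible to downstream systems, dropping removes the span entirely, and pseudonymizing preserves the ability to link repeated mentions of the same value without exposing it.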

Applications & Use Cases

Ingestion‑time sanitization for product logs, support transcripts, and user content so that databases, data warehouses, and vector stores never see raw PII.
Preparing corpora for model training, evaluation, and internal sharing by masking or pseudonymizing sensitive spans while preserving structure and utility.
Local‑first redaction in browsers or desktop apps (via Transformers.js or WebGPU) before text is sent to cloud LLM APIs like ChatGPT, Claude, or Gemini.
Building privacy gateways or middleware that sit in front of multiple AI providers and enforce organization‑specific privacy policies over eight PII categories.
Fine‑tuned, policy‑aware filters for regulated sectors (finance, healthcare, public sector) where different thresholds and masking strategies are required per field type.
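The per-field thresholds and masking strategies mentioned in the last use case could be expressed as a small policy table. The sketch below is one possible shape under stated assumptions: the `Rule` class, the `POLICY` entries, the label names, the threshold values, and the span format (character offsets plus a confidence `score`) are all hypothetical, and spans are assumed not to overlap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    threshold: float  # minimum detection score required to act on a span
    action: str       # "mask" or "drop"


# Hypothetical organization policy: a lower threshold means higher recall.
POLICY = {
    "EMAIL": Rule(threshold=0.50, action="mask"),
    "ACCOUNT_NUMBER": Rule(threshold=0.30, action="drop"),
}
DEFAULT_RULE = Rule(threshold=0.80, action="mask")


def apply_policy(text, spans, policy=POLICY, default=DEFAULT_RULE):
    """Redact only the spans whose score clears their category's threshold."""
    kept = [s for s in spans
            if s["score"] >= policy.get(s["label"], default).threshold]
    # Rewrite from right to left so earlier offsets stay valid.
    for s in sorted(kept, key=lambda s: s["start"], reverse=True):
        rule = policy.get(s["label"], default)
        replacement = "" if rule.action == "drop" else f"[{s['label']}]"
        text = text[:s["start"]] + replacement + text[s["end"]:]
    return text


text = "Mail a@b.co, acct 12345, by Bob."
spans = [
    {"start": 5, "end": 11, "label": "EMAIL", "score": 0.60},
    {"start": 18, "end": 23, "label": "ACCOUNT_NUMBER", "score": 0.35},
    {"start": 28, "end": 31, "label": "NAME", "score": 0.70},
]
print(apply_policy(text, spans))  # Mail [EMAIL], acct , by Bob.
```

Here the name stays in place because its score falls below the stricter default threshold, while the account number is removed even at low confidence. Keeping the policy as data lets compliance teams adjust thresholds without touching the redaction code.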