Features

Deep dive into LegalEase capabilities and AI pipelines.

This reference explains how the major subsystems work and what to expect from each one.

Document Processing Pipeline

LegalEase processes documents through a multi-stage pipeline to make them searchable and reviewable.

Processing Flow

  1. Upload - Files are uploaded to Firebase Storage under users/{userId}/documents/{documentId}/
  2. Trigger - A Firestore trigger detects the new document and starts processing
  3. Extraction - Docling extracts text, structure, tables, and bounding boxes. OCR runs automatically for scanned pages.
  4. Chunking - Content is split into hierarchical chunks:
    • summary - Coarse overviews for jumping into large files
    • section - Medium sections (~500 tokens)
    • paragraph - Smaller blocks for precise matching
  5. Embedding - Gemini generates dense vector embeddings for each chunk
  6. Indexing - Vectors are stored in Qdrant with metadata for hybrid search
  7. Page Renders - PDF pages are rendered as images for the viewer
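
Step 4 above can be sketched in isolation. This is a hypothetical illustration, not the actual chunker: token counts are approximated by word counts, and the `Chunk` shape is an assumption based on the granularities listed.

```typescript
// Hypothetical sketch of hierarchical chunking (step 4).
// Assumption: ~500 "tokens" is approximated here as ~500 words.
type ChunkLevel = "summary" | "section" | "paragraph";

interface Chunk {
  level: ChunkLevel;
  text: string;
}

function chunkDocument(text: string, sectionTokens = 500): Chunk[] {
  // Paragraph chunks: split on blank lines for precise matching.
  const paragraphs = text
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  const chunks: Chunk[] = paragraphs.map((p) => ({
    level: "paragraph" as ChunkLevel,
    text: p,
  }));

  // Section chunks: merge consecutive paragraphs until ~sectionTokens words.
  let buffer: string[] = [];
  let count = 0;
  for (const p of paragraphs) {
    buffer.push(p);
    count += p.split(/\s+/).length;
    if (count >= sectionTokens) {
      chunks.push({ level: "section", text: buffer.join("\n\n") });
      buffer = [];
      count = 0;
    }
  }
  if (buffer.length > 0) {
    chunks.push({ level: "section", text: buffer.join("\n\n") });
  }

  return chunks;
}
```

Each resulting chunk would then be embedded (step 5) and indexed with its level as metadata (step 6), so queries can match at the right granularity.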

Document Viewer

The built-in viewer provides:

  • Page-by-page navigation with thumbnails
  • Search hit highlighting using extracted bounding boxes
  • Text layer overlay for copy/paste
  • Entity sidebar showing extracted people, organizations, and dates
  • Chunk metadata for debugging

Supported Formats

Format              Support
PDF                 Full extraction with OCR
DOCX                Text and structure extraction
Images (PNG, JPG)   OCR extraction
HTML/Markdown       Direct text extraction

Transcription Pipeline

Audio and video files are transcribed using frontier AI models with speaker diarization.

Providers

LegalEase supports two transcription providers:

Gemini 2.5 Flash (default)

  • Works with Firebase Storage emulator
  • Automatic speaker diarization via prompt engineering
  • Speaker name inference from conversational context
  • Supports files up to 9.5 hours
  • Structured JSON output with timestamps

Google Speech-to-Text (Chirp 3)

  • Requires production GCS (not compatible with emulator)
  • Native speaker diarization
  • Higher accuracy for some audio types
  • Supports files up to 8 hours via BatchRecognize

Processing Flow

  1. Upload - Audio/video uploaded to Firebase Storage
  2. Trigger - Firestore trigger starts transcription job
  3. Provider Selection - Based on TRANSCRIPTION_PROVIDER setting
  4. Transcription - Provider generates segments with:
    • Start/end timestamps
    • Speaker identification
    • Transcript text
    • Confidence scores (Chirp only)
  5. Speaker Inference - Names inferred from conversation
  6. Summarization - Gemini generates summary, key moments, entities
  7. Waveform - Audio peaks extracted for visual player (in progress)
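
Step 3 of the flow above can be sketched as a small selection function. This is an illustration only: the accepted values match the two providers documented above, but the exact fallback behavior is an assumption.

```typescript
// Hypothetical sketch of provider selection (step 3), driven by the
// TRANSCRIPTION_PROVIDER setting. Assumption: unset falls back to Gemini,
// which the docs above list as the default.
type ProviderName = "gemini" | "chirp";

function selectProvider(setting: string | undefined): ProviderName {
  switch (setting) {
    case "chirp":
      return "chirp"; // Google Speech-to-Text (Chirp 3), production GCS only
    case "gemini":
    case undefined:
      return "gemini"; // Gemini 2.5 Flash is the default
    default:
      throw new Error(`Unknown transcription provider: ${setting}`);
  }
}
```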

Output Format

{
  fullText: string,
  segments: [{
    id: string,
    start: number,    // seconds
    end: number,
    text: string,
    speaker: string   // "Speaker 1", "Speaker 2", etc.
  }],
  speakers: [{
    id: string,
    inferredName?: string  // "John", "Jane", etc.
  }],
  duration: number,
  language: string,
  summarization: {
    summary: string,
    keyMoments: [...],
    actionItems: [...],
    topics: [...],
    entities: {...}
  }
}
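
As a consumer-side sketch (not part of the product API), the `segments` and `speakers` arrays above compose into a readable transcript by substituting inferred names for the generic "Speaker N" labels. The assumption here is that `speakers[].id` matches the `speaker` field on segments.

```typescript
// Illustrative only: render speaker-labeled transcript lines from the
// output format above, preferring inferredName when available.
interface Segment {
  id: string;
  start: number; // seconds
  end: number;
  text: string;
  speaker: string; // "Speaker 1", "Speaker 2", etc.
}

interface Speaker {
  id: string;
  inferredName?: string; // "John", "Jane", etc.
}

function renderTranscript(segments: Segment[], speakers: Speaker[]): string {
  const names = new Map<string, string>();
  for (const s of speakers) {
    names.set(s.id, s.inferredName ?? s.id);
  }
  return segments
    .map((seg) => {
      const name = names.get(seg.speaker) ?? seg.speaker;
      return `[${seg.start.toFixed(1)}s] ${name}: ${seg.text}`;
    })
    .join("\n");
}
```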

Summarization

All transcripts are automatically analyzed using Gemini 2.5 Flash.

Output Components

Executive Summary

  • 1-2 paragraph overview of the conversation
  • Focus on key facts and outcomes

Key Moments

  • Timestamped highlights with importance ratings (high/medium/low)
  • Click to jump directly to that point in the audio

Action Items

  • Follow-up tasks mentioned or implied
  • Extracted automatically from conversation

Topics

  • Main subjects discussed
  • Useful for categorization and filtering

Entities

  • People - Names mentioned in the conversation
  • Organizations - Companies, agencies, firms
  • Locations - Places referenced
  • Dates - Dates, deadlines, timeframes mentioned
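
Taken together, the components above suggest a shape for the `summarization` object in the output format. The field names below that are not stated in this document (`timestamp`, `description`) are assumptions for illustration.

```typescript
// Hypothetical shape of the summarization output, assembled from the
// components described above; exact field names are assumptions.
type Importance = "high" | "medium" | "low";

interface KeyMoment {
  timestamp: number; // seconds into the recording (click-to-jump target)
  importance: Importance;
  description: string;
}

interface Summarization {
  summary: string; // 1-2 paragraph executive summary
  keyMoments: KeyMoment[];
  actionItems: string[]; // follow-up tasks mentioned or implied
  topics: string[]; // main subjects discussed
  entities: {
    people: string[];
    organizations: string[];
    locations: string[];
    dates: string[];
  };
}
```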

Hybrid Search

LegalEase combines semantic and keyword search using the Qdrant vector database.

How It Works

  1. Query Processing
    • Gemini generates dense vector embedding
    • BM25 generates sparse keyword vector
  2. Search Execution
    • Both vectors query Qdrant simultaneously
    • Results from each method are retrieved
  3. Fusion
    • Reciprocal Rank Fusion (RRF) combines results
    • Balances semantic understanding with keyword precision
  4. Filtering
    • Results filtered by case, document type, date range
    • Permissions applied based on user/team
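
The fusion step (3) can be sketched with the standard RRF formula: each result's fused score is the sum of 1 / (k + rank) over every result list it appears in, so items ranked well by both dense and sparse search rise to the top. The constant k = 60 is the conventional choice from the RRF literature, not necessarily what LegalEase uses.

```typescript
// Minimal Reciprocal Rank Fusion sketch: rankings are ordered lists of
// result IDs (e.g. one from dense/semantic search, one from BM25).
function reciprocalRankFusion(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      // rank is 1-based: the top hit in each list contributes 1 / (k + 1).
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + index + 1));
    });
  }
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}
```

Because RRF only needs ranks, not raw scores, it sidesteps the problem of normalizing cosine similarities against BM25 scores.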

Search Modes

Mode       Description
Hybrid     Best of both - semantic understanding + keyword precision
Semantic   Conceptual matching - finds related content even without exact words
Keyword    Traditional BM25 - exact term matching

Indexed Content

  • Document chunks (all granularities)
  • Transcript segments
  • Summaries and key moments
  • Entity mentions

Firebase Integration

LegalEase leverages Firebase for a serverless, real-time architecture.

Services Used

Cloud Firestore

  • Primary database for cases, documents, transcripts
  • Real-time listeners for instant UI updates
  • Automatic offline support

Firebase Storage

  • File storage for uploads
  • Secure, authenticated access
  • Resumable uploads for large files

Cloud Functions

  • Genkit-based AI flows
  • Firestore triggers for background processing
  • Scales automatically

Firebase Auth

  • Google sign-in
  • Email/password authentication
  • Session management

Real-Time Updates

All data uses Firestore real-time listeners:

  • Document processing status updates instantly
  • Transcript completion triggers immediate UI refresh
  • Team members see collaborators' changes in real time
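
The listener pattern behind these bullets can be shown framework-free. This is a sketch of the shape only: the UI subscribes to a document's processing status and re-renders on every change, the way a Firestore snapshot listener would. The names and statuses here are illustrative, not the LegalEase API.

```typescript
// Framework-free sketch of the real-time subscription pattern.
// Assumption: these status values are illustrative placeholders.
type Status = "uploading" | "processing" | "ready" | "failed";
type Listener = (status: Status) => void;

class StatusChannel {
  private listeners = new Set<Listener>();
  private status: Status = "uploading";

  // Snapshot-listener style: fire immediately with current state,
  // return an unsubscribe handle.
  subscribe(fn: Listener): () => void {
    this.listeners.add(fn);
    fn(this.status);
    return () => this.listeners.delete(fn);
  }

  set(status: Status): void {
    this.status = status;
    this.listeners.forEach((fn) => fn(status));
  }
}
```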

AI Provider Architecture

LegalEase uses a provider abstraction for flexibility:

functions/src/transcription/
├── provider.ts      # Interface definition
├── types.ts         # Shared types
├── registry.ts      # Provider registration
├── index.ts         # Public API
└── providers/
    ├── gemini.ts    # Gemini 2.5 Flash
    └── chirp.ts     # Google Speech-to-Text

Adding New Providers

  1. Implement TranscriptionProvider interface
  2. Register in registry.ts
  3. Provider automatically available via TRANSCRIPTION_PROVIDER env var
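
The two steps above can be sketched end to end. This is a hedged guess at the shapes in `provider.ts` and `registry.ts` based only on the file layout; the interface, registry, and provider body below are assumptions, not the actual code.

```typescript
// Hypothetical provider abstraction: interface + registry + one registration.
interface TranscriptionProvider {
  name: string;
  transcribe(storagePath: string): Promise<{ fullText: string }>;
}

const registry = new Map<string, TranscriptionProvider>();

function registerProvider(provider: TranscriptionProvider): void {
  registry.set(provider.name, provider);
}

// In production the default would come from the TRANSCRIPTION_PROVIDER
// env var; hard-coded here to keep the sketch self-contained.
function getProvider(name: string = "gemini"): TranscriptionProvider {
  const provider = registry.get(name);
  if (!provider) throw new Error(`Unknown transcription provider: ${name}`);
  return provider;
}

// Step 1: implement the interface. Step 2: register it.
registerProvider({
  name: "gemini",
  async transcribe(storagePath) {
    return { fullText: `(transcribed ${storagePath})` }; // placeholder body
  },
});
```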

This pattern extends to future AI providers (OpenAI, Anthropic, local models).

Built with Nuxt UI • LegalEase AI © 2025