Docker Configuration for pingcap/autoflow

Docker Configuration for PingCAP Autoflow Development Environment

The PingCAP Autoflow development environment runs as a set of containerized services managed with Docker Compose. The steps below configure Docker for local development only; production deployment is out of scope.

Step 1: Setting Up `docker-compose.yml`

The docker-compose.yml file defines the services for the development environment. The full file is shown first, followed by a breakdown of each service:

version: '3'
services:
  redis:
    image: redis:6.0.16
    restart: always
    volumes:
      - ./redis-data:/data
    command: ["redis-server", "--loglevel", "warning"]

  backend:
    image: tidbai/backend:0.2.8
    restart: always
    depends_on:
      - redis
    ports:
      - "8000:80"
    env_file:
      - .env
    volumes:
      - ./data:/shared/data
    logging:
      driver: json-file
      options:
        max-size: "50m"
        max-file: "6"

  frontend:
    image: tidbai/frontend:0.2.8
    restart: always
    depends_on:
      - backend
    ports:
      - 3000:3000
    environment:
      BASE_URL: http://backend
    logging:
      driver: json-file
      options:
        max-size: "50m"
        max-file: "6"

  background:
    image: tidbai/backend:0.2.8
    restart: always
    depends_on:
      - redis
    ports:
      - "5555:5555"
    env_file:
      - .env
    volumes:
      - ./data:/shared/data
    command: /usr/bin/supervisord
    logging:
      driver: json-file
      options:
        max-size: "50m"
        max-file: "6"

  local-embedding-reranker:
    image: tidbai/local-embedding-reranker:v3-with-cache
    ports:
      - 5001:5001
    environment:
      - PRE_LOAD_DEFAULT_EMBEDDING_MODEL=true
      - PRE_LOAD_DEFAULT_RERANKER_MODEL=false
      - TRANSFORMERS_OFFLINE=1
    profiles:
      - local-embedding-reranker

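Redis Service: Runs Redis 6.0.16 for caching and message queuing.
- volumes: Maps the local directory ./redis-data to /data in the container for persistent storage.
- command: Starts redis-server with the log level set to warning.
Backend Service: Runs the backend application, published on host port 8000 (container port 80).
- depends_on: Starts the Redis container before the backend. This controls start order only; it does not wait until Redis is ready to accept connections.
- env_file: Loads environment variables from .env.
- volumes: Maps ./data on the host to /shared/data in the container.
Frontend Service: Runs the frontend application on port 3000.
- environment: Sets BASE_URL to http://backend, so the frontend reaches the backend by its service name on the Compose network.
Background Service: Runs background processes under supervisord, using the same image, .env file, and ./data volume as the backend. It publishes port 5555.
Local Embedding Reranker: Runs local embedding and reranking models on port 5001. It is assigned to the local-embedding-reranker profile, so it starts only when that profile is enabled. NVIDIA GPU support is optional (the GPU options are commented out in the source file and do not appear in the listing above).
Logging: The backend, frontend, and background services use the json-file driver with rotation, keeping at most 6 log files of up to 50 MB each (about 300 MB per service).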
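
Because depends_on only orders container startup, a service that needs Redis immediately may want to poll until Redis actually answers. The following is a minimal sketch, not part of the Autoflow codebase: the function names redis_ping and wait_for_redis are hypothetical, and it assumes Redis is reachable at host redis on its default port 6379 with no password, which is what the compose file above suggests.

```python
import socket
import time


def redis_ping(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if the server at host:port answers PING with +PONG."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            # Redis accepts "inline" commands terminated by CRLF.
            conn.sendall(b"PING\r\n")
            return conn.recv(16).startswith(b"+PONG")
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def wait_for_redis(host: str = "redis", port: int = 6379,
                   attempts: int = 30, delay: float = 1.0) -> bool:
    """Poll until Redis answers, giving up after the given number of attempts."""
    for _ in range(attempts):
        if redis_ping(host, port):
            return True
        time.sleep(delay)
    return False
```

Inside a container on the Compose network, wait_for_redis() could be called before the application starts. From the host it would not work as written, since the compose file does not publish the Redis port.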

Step 2: Creating the Dockerfile

The Dockerfile describes how to build the image for the backend service. The essential setup is:

FROM tiangolo/uvicorn-gunicorn-fastapi:python3.11-slim

WORKDIR /app/

RUN apt-get update && apt-get install -y supervisor

COPY supervisord.conf /usr/etc/supervisord.conf
COPY requirements.lock /app/requirements.lock

RUN PYTHONDONTWRITEBYTECODE=1 pip install --no-cache-dir -r /app/requirements.lock && \
    playwright install chromium && \
    playwright install-deps chromium

RUN python -c 'import nltk; \
download_dir = "/usr/local/lib/python3.11/site-packages/llama_index/core/_static/nltk_cache";\
nltk.download("stopwords", download_dir=download_dir);\
nltk.download("punkt", download_dir=download_dir);'

ENV PYTHONPATH=/app

COPY . /app/

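Base Image: Uses tiangolo/uvicorn-gunicorn-fastapi (Python 3.11, slim variant), which serves FastAPI applications with Gunicorn and Uvicorn.
Working Directory: Sets /app/ as the working directory.
Install Dependencies: Installs supervisor with apt-get, then the Python dependencies listed in requirements.lock, then Playwright's Chromium browser and its system dependencies.
NLTK Resources: Downloads the NLTK stopwords and punkt data into the llama_index NLTK cache directory at build time.
Environment: Sets PYTHONPATH to /app so the application modules can be imported.
Copy Application Code: Copies the application code into /app/ as the last step, so code changes do not invalidate the cached dependency layers.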
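
The NLTK step hard-codes python3.11 in the download path, so the data could end up in the wrong directory if the base image's Python version ever changed. As a hedged alternative, the path could be derived from the running interpreter. The sketch below uses only the standard library; the helper name nltk_cache_dir is hypothetical, and the llama_index/core/_static/nltk_cache suffix is copied from the Dockerfile above rather than verified against llama_index itself.

```python
import sysconfig
from pathlib import Path


def nltk_cache_dir() -> Path:
    """Return the llama_index NLTK cache path for the running interpreter."""
    # "purelib" is the site-packages directory of the current Python install.
    site_packages = Path(sysconfig.get_paths()["purelib"])
    return site_packages / "llama_index" / "core" / "_static" / "nltk_cache"


print(nltk_cache_dir())
```

In the Python 3.11 base image this should resolve to the same directory the Dockerfile names explicitly. The result could then be passed as download_dir to nltk.download in place of the literal string.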

Step 3: Running the Docker Configuration

After configuring docker-compose.yml and the Dockerfile, and creating the .env file that the backend and background services load through env_file, start the development environment using:

docker-compose up --build

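This command starts the services defined in docker-compose.yml. Every service in the file above references a prebuilt image and has no build section, so Compose pulls the images instead of building them; --build only affects services that define a build section. Services assigned to a profile, such as local-embedding-reranker, start only when that profile is enabled (for example, with --profile local-embedding-reranker). To run in detached mode, append -d to the command.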
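
Services assigned to a Compose profile, such as local-embedding-reranker, are skipped by a plain up, so the profile has to be named explicitly. The sketch below assembles such a command line; the helper compose_up_command is hypothetical, and it assumes the Compose V2 plugin (docker compose) rather than the standalone docker-compose binary used above.

```python
import shlex


def compose_up_command(profiles=(), detached=False, build=False):
    """Build the argument list for starting the stack with Docker Compose."""
    cmd = ["docker", "compose"]
    for profile in profiles:
        # --profile is a top-level Compose option, so it precedes "up".
        cmd += ["--profile", profile]
    cmd.append("up")
    if build:
        cmd.append("--build")
    if detached:
        cmd.append("-d")
    return cmd


cmd = compose_up_command(profiles=["local-embedding-reranker"], detached=True)
print(shlex.join(cmd))
# prints: docker compose --profile local-embedding-reranker up -d
```

The returned list can be passed to subprocess.run(cmd, check=True) from the directory that contains docker-compose.yml.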

Step 4: Accessing Services

The backend can be accessed on http://localhost:8000.
The frontend will be available on http://localhost:3000.
The background service is exposed on port 5555, and the local embedding reranker, when its profile is enabled, can be accessed via http://localhost:5001.
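
A quick way to confirm that the published ports respond is to probe each one over HTTP. This is a minimal sketch using only the standard library: the probe helper and the SERVICE_URLS table are hypothetical, the URLs are derived from the port mappings in docker-compose.yml, and it assumes every listed port speaks HTTP, which this guide does not state for port 5555.

```python
import http.client
import urllib.error
import urllib.request

# Host-side URLs derived from the port mappings in docker-compose.yml.
SERVICE_URLS = {
    "backend": "http://localhost:8000",
    "frontend": "http://localhost:3000",
    "background": "http://localhost:5555",
    "local-embedding-reranker": "http://localhost:5001",
}


def probe(url, timeout=2.0):
    """Return the HTTP status code from url, or None if nothing answers."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered with an error status, so it is at least running.
        return err.code
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, DNS failure, or a non-HTTP reply.
        return None


if __name__ == "__main__":
    for name, url in SERVICE_URLS.items():
        status = probe(url)
        label = "no response" if status is None else f"HTTP {status}"
        print(f"{name}: {label}")
```

Any status code, including 404, indicates that the container is listening; "no response" for local-embedding-reranker is expected unless its profile was enabled.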

With these steps complete, the PingCAP Autoflow services run locally and are ready for development.

References

File: docker-compose.yml
File: Dockerfile