Production deployment for docker/genai-stack

This guide outlines the step-by-step process for deploying the docker/genai-stack in a production environment. It assumes familiarity with Docker, Docker Compose, and the services involved.

Prerequisites

Docker installed on your production server.
Docker Compose for orchestrating multi-container Docker applications.
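Sufficient CPU, memory, and disk space for the models and data you plan to run. The linux-gpu profile additionally requires an NVIDIA GPU and the NVIDIA Container Toolkit on the host.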

Step 1: Configure Environment Variables

Before moving forward, ensure that the environment variables are properly configured. Create a .env file in your project root if it is not already present:

OLLAMA_BASE_URL=http://your-ollama-url
NEO4J_USERNAME=your-neo4j-username
NEO4J_PASSWORD=your-neo4j-password
OPENAI_API_KEY=your-openai-key
GOOGLE_API_KEY=your-google-key
LANGCHAIN_API_KEY=your-langchain-key
AWS_ACCESS_KEY_ID=your-aws-access-key
AWS_SECRET_ACCESS_KEY=your-aws-secret-key
AWS_DEFAULT_REGION=your-aws-region

Make sure to replace the placeholders with appropriate values.
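
A quick check before moving on can catch values that were never filled in. The sketch below assumes the placeholders all follow the your-... pattern used in the example above; check_env is a hypothetical helper name, not part of genai-stack.

```shell
# check_env FILE: report variables that are empty or still hold a
# "your-..." placeholder. Returns 0 if none are found, 1 otherwise.
# Assumption: placeholders look like "your-x" or "http://your-x".
check_env() {
    env_file="${1:-.env}"
    if [ ! -f "$env_file" ]; then
        echo "missing file: $env_file" >&2
        return 1
    fi
    # Drop comment lines, then keep the names of suspicious keys.
    unset_keys=$(grep -v '^[[:space:]]*#' "$env_file" \
        | grep -E '^[A-Za-z_][A-Za-z0-9_]*=((https?://)?your-.*)?$' \
        | cut -d= -f1)
    if [ -n "$unset_keys" ]; then
        echo "placeholder or empty values:" >&2
        echo "$unset_keys" >&2
        return 1
    fi
    return 0
}
```

Calling check_env .env from the project root prints the offending variable names to standard error and returns a non-zero status, so it can gate the later steps in a deployment script.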

Step 2: Create `docker-compose.yml`

The following is the core structure for your docker-compose.yml file. This example incorporates various services, including the LLM, pull-model, database, loader, and others:

version: '3.8'

services:
  llm: &llm
    image: ollama/ollama:latest
    profiles: ["linux"]
    networks:
      - net

  llm-gpu:
    <<: *llm
    profiles: ["linux-gpu"]
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

  pull-model:
    image: genai-stack/pull-model:latest
    build:
      context: .
      dockerfile: pull_model.Dockerfile
    environment:
      - OLLAMA_BASE_URL=${OLLAMA_BASE_URL}
      - LLM=${LLM-llama2}
    networks:
      - net
    tty: true

  database:
    image: neo4j:5.11
    environment:
      - NEO4J_AUTH=${NEO4J_USERNAME-neo4j}/${NEO4J_PASSWORD-password}
      - NEO4J_PLUGINS=["apoc"]
      - NEO4J_db_tx__log_rotation_retention__policy=false
      - NEO4J_dbms_security_procedures_unrestricted=apoc.*
    ports:
      - 7687:7687
      - 7474:7474
    volumes:
      - $PWD/data:/data
    healthcheck:
      test: ["CMD-SHELL", "wget --no-verbose --tries=1 --spider localhost:7474 || exit 1"]
      interval: 15s
      timeout: 30s
      retries: 10
    networks:
      - net

  loader:
    build:
      context: .
      dockerfile: loader.Dockerfile
    environment:
      - NEO4J_URI=${NEO4J_URI-neo4j://database:7687}
      - NEO4J_PASSWORD=${NEO4J_PASSWORD-password}
      - NEO4J_USERNAME=${NEO4J_USERNAME-neo4j}
    networks:
      - net
    depends_on:
      database:
        condition: service_healthy
      pull-model:
        condition: service_completed_successfully
    ports:
      - 8081:8080
      - 8502:8502

  # Additional services omitted for brevity; include bot, pdf_bot, api, and front-end similarly

networks:
  net:
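
Before building anything, it can help to let Compose parse the file and report syntax or interpolation errors. A minimal sketch, assuming the file is named docker-compose.yml; validate_compose and the COMPOSE variable are hypothetical names introduced here so the command can be swapped (for example to "docker compose").

```shell
# Command used to talk to Compose; override with COMPOSE="docker compose".
COMPOSE="${COMPOSE:-docker-compose}"

# validate_compose [FILE]: ask Compose to parse FILE without printing it.
# "config -q" only validates the configuration and prints nothing on success.
validate_compose() {
    compose_file="${1:-docker-compose.yml}"
    if $COMPOSE -f "$compose_file" config -q; then
        echo "ok: $compose_file"
    else
        echo "invalid: $compose_file" >&2
        return 1
    fi
}
```

Running validate_compose in the project root prints ok: docker-compose.yml when the file parses cleanly.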

Step 3: Build and Run Services

To build and run the services in your application, execute the following command in your terminal:

docker-compose up --build -d

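The --build flag rebuilds the service images before the containers start, and the -d flag runs them in detached mode. Because llm and llm-gpu are assigned to the linux and linux-gpu profiles in the compose file above, they start only when a profile is enabled, for example with docker-compose --profile linux up --build -d.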
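
The compose file in Step 2 puts llm behind the linux profile and llm-gpu behind the linux-gpu profile, so one of them has to be chosen at start-up. The sketch below picks one automatically; it assumes that a working nvidia-smi on the host is a good enough signal that the GPU profile will work, which may not hold on every system. pick_profile is a hypothetical helper name.

```shell
# pick_profile: print "linux-gpu" if nvidia-smi exists and runs successfully,
# otherwise print "linux".
# Assumption: nvidia-smi succeeding implies the GPU profile is usable.
pick_profile() {
    if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi >/dev/null 2>&1; then
        echo "linux-gpu"
    else
        echo "linux"
    fi
}
```

It could then be combined with the start command as docker-compose --profile "$(pick_profile)" up --build -d.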

Step 4: Verifying Service Health

You can verify that your services are running correctly by checking the logs or the health of specific services:

docker-compose logs

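To check the health of a particular service, pass its container ID to docker inspect. The command expects a container name or ID rather than a Compose service name, so look the ID up first:

docker inspect --format='{{json .State.Health}}' $(docker-compose ps -q <service_name>)

Replace <service_name> with the name of the service you want to inspect, such as api or database. Only services that define a healthcheck report a status; for the others the output is empty (null).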
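
In a deployment script it is often more useful to wait until a service reports healthy than to inspect it once. A minimal polling sketch follows; health_of, wait_healthy, and the DOCKER and COMPOSE variables are hypothetical names, and the retry count and delay are arbitrary defaults.

```shell
DOCKER="${DOCKER:-docker}"
COMPOSE="${COMPOSE:-docker-compose}"

# health_of SERVICE: print the health status of the service's first container,
# "none" if it has no healthcheck, or "missing" if no container exists.
health_of() {
    cid=$($COMPOSE ps -q "$1" | head -n 1)
    [ -n "$cid" ] || { echo "missing"; return 0; }
    $DOCKER inspect \
        --format '{{if .State.Health}}{{.State.Health.Status}}{{else}}none{{end}}' \
        "$cid"
}

# wait_healthy SERVICE [TRIES] [DELAY]: poll until the status is "healthy".
# Returns 1 if the service is still not healthy after TRIES checks.
wait_healthy() {
    service="$1"; tries="${2:-20}"; delay="${3:-5}"
    i=0
    while [ "$i" -lt "$tries" ]; do
        status=$(health_of "$service")
        if [ "$status" = "healthy" ]; then
            echo "$service is healthy"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "$service not healthy after $tries checks (last status: $status)" >&2
    return 1
}
```

For example, wait_healthy database 20 5 checks the Neo4j container up to 20 times, 5 seconds apart.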

Step 5: Scaling Services

If needed, you can scale specific services with the --scale flag of docker-compose up. For instance, to scale the API service to three instances, use the following command:

docker-compose up -d --scale api=3

Ensure that scaled services are stateless or keep their state externally, and that they do not publish a fixed host port: a mapping such as 8081:8080 can be bound by only one container, so the additional instances would fail to start.
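
After scaling, it is worth confirming that the expected number of containers actually exists. A small sketch, with running_instances and expect_instances as hypothetical helper names; note that depending on the Compose version, ps -q may also list stopped containers.

```shell
COMPOSE="${COMPOSE:-docker-compose}"

# running_instances SERVICE: print how many container IDs Compose lists
# for SERVICE (one ID per line from "ps -q").
running_instances() {
    $COMPOSE ps -q "$1" | grep -c . || true
}

# expect_instances SERVICE N: return 0 if exactly N containers are listed.
expect_instances() {
    actual=$(running_instances "$1")
    if [ "$actual" -eq "$2" ]; then
        return 0
    fi
    echo "$1: expected $2 instances, found $actual" >&2
    return 1
}
```

After the scale command above, expect_instances api 3 should return successfully.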

Step 6: Monitoring the Deployment

To monitor your services, consider using tools like Prometheus and Grafana, which can provide insights into each service’s performance and resource utilization.
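
Until a full monitoring stack is in place, docker stats offers a quick spot check. The sketch below is not a substitute for Prometheus or Grafana; it simply lists containers whose memory usage exceeds a threshold. high_memory and the 80 percent default are assumptions made for illustration.

```shell
DOCKER="${DOCKER:-docker}"

# high_memory [LIMIT]: print "name percent" for every container whose
# memory usage percentage is above LIMIT (default 80).
high_memory() {
    limit="${1:-80}"
    $DOCKER stats --no-stream --format '{{.Name}} {{.MemPerc}}' \
        | awk -v limit="$limit" '{
            pct = $2
            sub(/%/, "", pct)              # strip the trailing percent sign
            if (pct + 0 > limit + 0) print $1, $2
        }'
}
```

Running high_memory 90 prints nothing when every container is below 90 percent of its memory limit.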

Step 7: Updating the Deployment

When updates to the application are made, apply the following steps:

Edit your code and save changes.
Build and run the updated services:

docker-compose up --build -d

Monitor the system logs to verify that the deployment is successful.
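
Checking the logs after an update can be partly automated. A rough sketch, assuming that problem lines contain the words error, exception, or traceback; this is a heuristic, not something genai-stack guarantees, and count_log_errors is a hypothetical helper name.

```shell
COMPOSE="${COMPOSE:-docker-compose}"

# count_log_errors SERVICE [LINES]: print how many of the last LINES log
# lines (default 200) mention error, exception, or traceback.
count_log_errors() {
    $COMPOSE logs --no-color --tail="${2:-200}" "$1" 2>&1 \
        | grep -i -c -E 'error|exception|traceback' || true
}
```

A result of 0 from count_log_errors api shortly after the update is a reasonable first signal, though it does not replace reading the logs when something looks wrong.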

Conclusion

Following these steps will allow for a reliable production deployment of docker/genai-stack. Be sure to follow best practices for container security and service management, such as replacing the default Neo4j credentials and keeping the .env file out of version control.

Source: Docker/GenAI stack configuration detail