# Performance Optimization

### Overview

Performance optimization keeps AINexLayer responsive and efficient under load. This guide covers optimization techniques ranging from system-level tuning to application-specific improvements.

### System-Level Optimization

#### Hardware Requirements

**Minimum Requirements**

* **CPU**: 4 cores, 2.4 GHz
* **RAM**: 8 GB
* **Storage**: 50 GB SSD
* **Network**: 100 Mbps

**Recommended Requirements**

* **CPU**: 8 cores, 3.0 GHz
* **RAM**: 16 GB
* **Storage**: 200 GB NVMe SSD
* **Network**: 1 Gbps

**Production Requirements**

* **CPU**: 16+ cores, 3.5 GHz
* **RAM**: 32+ GB
* **Storage**: 500+ GB NVMe SSD
* **Network**: 10 Gbps

#### Operating System Optimization

**Linux System Tuning**

```bash
# Run these commands as root.

# Increase file descriptor limits (takes effect for new login sessions)
echo '* soft nofile 65536' >> /etc/security/limits.conf
echo '* hard nofile 65536' >> /etc/security/limits.conf

# Optimize kernel parameters
echo 'vm.swappiness=10' >> /etc/sysctl.conf
echo 'vm.dirty_ratio=15' >> /etc/sysctl.conf
echo 'vm.dirty_background_ratio=5' >> /etc/sysctl.conf
echo 'net.core.rmem_max=16777216' >> /etc/sysctl.conf
echo 'net.core.wmem_max=16777216' >> /etc/sysctl.conf

# Enable TCP optimizations (BBR requires Linux 4.9+ with the tcp_bbr module)
echo 'net.ipv4.tcp_congestion_control=bbr' >> /etc/sysctl.conf
echo 'net.core.default_qdisc=fq' >> /etc/sysctl.conf

# Apply the settings once every line has been added
sysctl -p

# Optimize I/O scheduler. List the available schedulers first:
#   cat /sys/block/sda/queue/scheduler
# Kernels 5.0+ offer 'mq-deadline'; older kernels offer 'deadline'.
# Replace sda with your device. The setting does not persist across reboots.
echo 'mq-deadline' > /sys/block/sda/queue/scheduler
```

**Docker Optimization**

```bash
# Optimize Docker daemon (this overwrites any existing /etc/docker/daemon.json)
cat > /etc/docker/daemon.json << EOF
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  },
  "storage-driver": "overlay2",
  "storage-opts": [
    "overlay2.override_kernel_check=true"
  ],
  "default-ulimits": {
    "nofile": {
      "Name": "nofile",
      "Hard": 65536,
      "Soft": 65536
    }
  }
}
EOF

# Restart Docker
systemctl restart docker
```

#### Network Optimization

**TCP Tuning**

```bash
# Optimize TCP settings (skip any line already added during Linux System Tuning)
echo 'net.ipv4.tcp_rmem=4096 65536 16777216' >> /etc/sysctl.conf
echo 'net.ipv4.tcp_wmem=4096 65536 16777216' >> /etc/sysctl.conf
echo 'net.ipv4.tcp_congestion_control=bbr' >> /etc/sysctl.conf
echo 'net.core.rmem_max=16777216' >> /etc/sysctl.conf
echo 'net.core.wmem_max=16777216' >> /etc/sysctl.conf
sysctl -p
```

**Load Balancer Configuration**

```nginx
# nginx.conf optimization
worker_processes auto;
worker_cpu_affinity auto;
worker_rlimit_nofile 65536;

events {
    worker_connections 4096;
    use epoll;
    multi_accept on;
}

http {
    # Connection optimization
    keepalive_timeout 65;
    keepalive_requests 1000;

    # Buffer optimization
    client_body_buffer_size 128k;
    client_max_body_size 100m;
    client_header_buffer_size 1k;
    large_client_header_buffers 4 4k;

    # Gzip compression
    gzip on;
    gzip_vary on;
    gzip_min_length 1024;
    gzip_comp_level 6;
    gzip_types
        text/plain
        text/css
        text/xml
        text/javascript
        application/json
        application/javascript
        application/xml+rss
        application/atom+xml
        image/svg+xml;
}
```

### Application-Level Optimization

#### Node.js Optimization

**Memory Management**

```bash
# Raise the V8 old-generation heap limit (value in MB)
export NODE_OPTIONS="--max-old-space-size=4096"

# Pass other V8 flags on the command line: NODE_OPTIONS accepts only a
# limited set of V8 options and rejects the rest at startup
node --max-semi-space-size=128 index.js

# Avoid --gc-interval (a V8 testing flag that forces frequent collections)
# and --harmony / --experimental-modules (neither improves performance,
# and ES modules no longer need a flag)
```

**Process Management**

```bash
# Use PM2 for process management
npm install -g pm2

# Create PM2 ecosystem file
cat > ecosystem.config.js << 'EOF'
module.exports = {
  apps: [{
    name: 'ainexlayer',
    script: 'index.js',
    instances: 'max',
    exec_mode: 'cluster',
    max_memory_restart: '2G',
    env: {
      NODE_ENV: 'production',
      PORT: 3001
    },
    env_production: {
      NODE_ENV: 'production',
      PORT: 3001
    }
  }]
};
EOF

# Start with PM2
pm2 start ecosystem.config.js --env production
```

#### Database Optimization

**PostgreSQL Tuning**

```conf
# postgresql.conf optimization

# Memory settings (a common starting point for shared_buffers is about 25% of RAM)
shared_buffers = 256MB
effective_cache_size = 1GB
work_mem = 4MB
maintenance_work_mem = 64MB

# Checkpoint and WAL settings
checkpoint_completion_target = 0.9
wal_buffers = 16MB

# Planner statistics
default_statistics_target = 100

# Connection settings (changing shared_preload_libraries requires a restart)
max_connections = 200
shared_preload_libraries = 'pg_stat_statements'

# Logging (log statements slower than 1000 ms)
log_min_duration_statement = 1000
log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d,app=%a,client=%h '
```

**Redis Optimization**

```conf
# redis.conf optimization

# Memory management
maxmemory 2gb
maxmemory-policy allkeys-lru
maxmemory-samples 5

# Persistence optimization
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes

# Network optimization
tcp-keepalive 300
timeout 0

# Performance optimization
hz 10
dynamic-hz yes
```

#### Vector Database Optimization

**LanceDB Optimization**

```json
{
  "lancedb": {
    "optimization": {
      "indexType": "IVF_PQ",
      "indexParams": {
        "num_partitions": 256,
        "num_sub_vectors": 16
      },
      "cacheSize": "1GB",
      "compression": "lz4"
    }
  }
}
```

**Pinecone Optimization**

```json
{
  "pinecone": {
    "optimization": {
      "indexType": "p1",
      "replicas": 2,
      "podType": "p1.x1",
      "metadataConfig": {
        "indexed": ["category", "date", "author"]
      }
    }
  }
}
```

### AI Model Optimization

#### Model Selection

**Performance vs. Quality Trade-offs**

```json
{
  "modelSelection": {
    "fast": {
      "llm": "gpt-3.5-turbo",
      "embedding": "text-embedding-3-small",
      "useCase": "real-time chat"
    },
    "balanced": {
      "llm": "gpt-4",
      "embedding": "text-embedding-3-small",
      "useCase": "general purpose"
    },
    "quality": {
      "llm": "gpt-4-turbo",
      "embedding": "text-embedding-3-large",
      "useCase": "complex analysis"
    }
  }
}
```

**Local Model Optimization**

```bash
# Ollama optimization
export OLLAMA_NUM_PARALLEL=4
export OLLAMA_MAX_LOADED_MODELS=2
export OLLAMA_FLASH_ATTENTION=1

# Model quantization (tag names vary by model; confirm them in the Ollama library)
ollama pull llama2:7b-chat-q4_0
ollama pull mistral:7b-instruct-q4_0
```

#### Embedding Optimization

**Batch Processing**

```javascript
// Optimize embedding generation
const embeddingConfig = {
  batchSize: 32,
  concurrency: 4,
  cache: true,
  compression: 'gzip'
};

// Split an array into consecutive slices of at most `size` items
function chunk(items, size) {
  const result = [];
  for (let i = 0; i < items.length; i += size) {
    result.push(items.slice(i, i + size));
  }
  return result;
}

// Process documents in batches. Batches run one after another; the documents
// inside a batch are embedded concurrently.
// generateEmbedding and storeEmbeddings are provided by the application.
async function processDocumentsBatch(documents) {
  const batches = chunk(documents, embeddingConfig.batchSize);

  for (const batch of batches) {
    const embeddings = await Promise.all(
      batch.map(doc => generateEmbedding(doc.text))
    );
    await storeEmbeddings(batch, embeddings);
  }
}
```

**Caching Strategy**

```javascript
// Implement embedding cache
const crypto = require('crypto');

// Note: this Map grows without bound; cap its size for long-running processes
const embeddingCache = new Map();

async function getCachedEmbedding(text) {
  // The hash serves only as a cache key, not as a security measure
  const hash = crypto.createHash('md5').update(text).digest('hex');

  if (embeddingCache.has(hash)) {
    return embeddingCache.get(hash);
  }

  const embedding = await generateEmbedding(text);
  embeddingCache.set(hash, embedding);
  return embedding;
}
```

### Document Processing Optimization

#### Chunking Strategy

**Optimal Chunk Sizes**

```json
{
  "chunking": {
    "text": {
      "chunkSize": 1000,
      "chunkOverlap": 200,
      "strategy": "semantic"
    },
    "code": {
      "chunkSize": 500,
      "chunkOverlap": 100,
      "strategy": "function_boundary"
    },
    "technical": {
      "chunkSize": 1500,
      "chunkOverlap": 300,
      "strategy": "paragraph_boundary"
    }
  }
}
```

**Parallel Processing**

```javascript
// Parallel document processing
// WorkerPool is an application-provided worker pool; chunk() is the helper
// defined in the Batch Processing example.
const workerPool = new WorkerPool(4);

async function processDocumentsParallel(documents) {
  const groups = chunk(documents, 10);
  const results = await Promise.all(
    groups.map(group => workerPool.process(group))
  );
  return results.flat();
}
```

#### OCR Optimization

**Image Preprocessing**

```python
# Optimize OCR preprocessing
import cv2

def preprocess_image(image):
    # Convert to grayscale (expects a BGR image, as returned by cv2.imread)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # Noise reduction
    denoised = cv2.medianBlur(gray, 3)

    # Contrast enhancement
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    enhanced = clahe.apply(denoised)

    # Binarization (Otsu's method selects the threshold automatically)
    _, binary = cv2.threshold(enhanced, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    return binary
```

**Tesseract Optimization**

```bash
# Optimize Tesseract (adjust the tessdata path to your installed version)
export TESSDATA_PREFIX=/usr/share/tesseract-ocr/4.00/tessdata
export OMP_THREAD_LIMIT=4

# Use optimized config. The whitelist restricts output to alphanumeric
# characters; omit it for general text.
tesseract image.png output -c tessedit_char_whitelist=0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
```

### Caching Strategies

#### Application-Level Caching

**Redis Caching**

```javascript
// Implement Redis caching (written for node-redis v4 or later)
const crypto = require('crypto');
const redis = require('redis');

const client = redis.createClient();
client.on('error', err => console.error('Redis client error:', err));

// Connect once at startup; cache calls wait for the connection
const ready = client.connect();

// Cache API responses
async function getCachedResponse(key, fetchFunction, ttl = 3600) {
  await ready;

  const cached = await client.get(key);
  if (cached !== null) {
    return JSON.parse(cached);
  }

  const data = await fetchFunction();
  await client.setEx(key, ttl, JSON.stringify(data));
  return data;
}

// Cache embeddings for 24 hours
async function getCachedEmbedding(text) {
  const key = `embedding:${crypto.createHash('md5').update(text).digest('hex')}`;
  return getCachedResponse(key, () => generateEmbedding(text), 86400);
}
```

**Memory Caching**

```javascript
// Implement in-memory cache
const NodeCache = require('node-cache');

const cache = new NodeCache({
  stdTTL: 3600,
  checkperiod: 600,
  maxKeys: 10000
});

// Cache frequently accessed data. fetchFunction must be synchronous here.
function getCachedData(key, fetchFunction) {
  let data = cache.get(key);
  // node-cache returns undefined on a miss; falsy values such as 0 are valid hits
  if (data === undefined) {
    data = fetchFunction();
    cache.set(key, data);
  }
  return data;
}
```

#### Database Caching

**Query Optimization**

```sql
-- Optimize database queries

-- Create indexes
CREATE INDEX idx_documents_workspace_id ON documents(workspace_id);
CREATE INDEX idx_documents_created_at ON documents(created_at);
CREATE INDEX idx_chat_messages_conversation_id ON chat_messages(conversation_id);

-- Use prepared statements
PREPARE get_documents AS
SELECT * FROM documents
WHERE workspace_id = $1 AND created_at > $2
ORDER BY created_at DESC
LIMIT $3;

-- Optimize joins
EXPLAIN ANALYZE
SELECT d.*, w.name AS workspace_name
FROM documents d
JOIN workspaces w ON d.workspace_id = w.id
WHERE d.status = 'processed';
```

**Connection Pooling**

```javascript
// Optimize database connections
const { Pool } = require('pg');

const pool = new Pool({
  host: 'localhost',
  port: 5432,
  database: 'ainexlayer',
  user: 'ainexlayer',
  // Read the password from the environment instead of hard-coding it
  password: process.env.PGPASSWORD,
  max: 20,
  min: 5,
  idleTimeoutMillis: 30000,
  connectionTimeoutMillis: 2000
  // Options such as acquireTimeoutMillis, createTimeoutMillis, and
  // reapIntervalMillis belong to other pool libraries and are not pg settings
});
```

### Monitoring and Profiling

#### Performance Monitoring

**Application Metrics**

```javascript
// Implement performance monitoring
const prometheus = require('prom-client');

// Create metrics
const httpRequestDuration = new prometheus.Histogram({
  name: 'http_request_duration_seconds',
  help: 'Duration of HTTP requests in seconds',
  labelNames: ['method', 'route', 'status']
});

const documentProcessingTime = new prometheus.Histogram({
  name: 'document_processing_duration_seconds',
  help: 'Duration of document processing in seconds',
  labelNames: ['document_type', 'status']
});

// Middleware to track requests (`app` is the Express application)
app.use((req, res, next) => {
  const start = Date.now();

  res.on('finish', () => {
    const duration = (Date.now() - start) / 1000;
    httpRequestDuration
      .labels(req.method, req.route?.path || req.path, res.statusCode)
      .observe(duration);
  });

  next();
});
```

**System Metrics**

```bash
#!/bin/bash
# monitor.sh: log system performance once a minute

IFACE=eth0  # network interface to watch; adjust to your system

while true; do
    # CPU usage (user time, as reported by top)
    CPU_USAGE=$(top -bn1 | grep "Cpu(s)" | awk '{print $2}' | cut -d'%' -f1)

    # Memory usage
    MEM_USAGE=$(free | grep Mem | awk '{printf "%.2f", $3/$2 * 100.0}')

    # Disk usage
    DISK_USAGE=$(df / | tail -1 | awk '{print $5}' | cut -d'%' -f1)

    # Network usage (cumulative bytes received and sent)
    NETWORK_IN=$(grep "$IFACE" /proc/net/dev | awk '{print $2}')
    NETWORK_OUT=$(grep "$IFACE" /proc/net/dev | awk '{print $10}')

    # Log metrics
    echo "$(date): CPU: $CPU_USAGE%, Memory: $MEM_USAGE%, Disk: $DISK_USAGE%, Net in/out: $NETWORK_IN/$NETWORK_OUT bytes"

    sleep 60
done
```

#### Profiling

**Node.js Profiling**

```bash
# Enable Node.js profiling: record a V8 profile, then process the log it writes
node --prof index.js
node --prof-process isolate*.log > profile.txt

# Use clinic.js for profiling
npm install -g clinic
clinic doctor -- node index.js
clinic flame -- node index.js
clinic bubbleprof -- node index.js
```

**Database Profiling**

```sql
-- Enable PostgreSQL profiling

-- Enable query logging. log_statement = 'all' logs every statement and is
-- expensive, so enable it only while debugging. The second setting logs
-- statements that run longer than 1000 ms.
ALTER SYSTEM SET log_statement = 'all';
ALTER SYSTEM SET log_min_duration_statement = 1000;
SELECT pg_reload_conf();

-- Analyze slow queries (requires the pg_stat_statements extension).
-- Column names are for PostgreSQL 13+; older versions use mean_time and total_time.
SELECT query, mean_exec_time, calls, total_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;
```

### Scaling Strategies

#### Horizontal Scaling

**Load Balancing**

```nginx
# nginx load balancer configuration

# Send "Connection: upgrade" only for WebSocket requests, so that other
# requests can reuse the upstream keepalive connections
map $http_upgrade $connection_upgrade {
    default upgrade;
    ''      '';
}

upstream ainexlayer_backend {
    least_conn;
    server ainexlayer-1:3001 max_fails=3 fail_timeout=30s;
    server ainexlayer-2:3001 max_fails=3 fail_timeout=30s;
    server ainexlayer-3:3001 max_fails=3 fail_timeout=30s;
    keepalive 32;
}

server {
    listen 80;

    location / {
        proxy_pass http://ainexlayer_backend;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection $connection_upgrade;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_cache_bypass $http_upgrade;
    }
}
```

**Container Orchestration**

```yaml
# docker-compose scaling
version: '3.8'

services:
  ainexlayer:
    image: alakinfotech/ainexlayer:latest
    deploy:
      replicas: 3
      resources:
        limits:
          memory: 4G
          cpus: '2.0'
        reservations:
          memory: 2G
          cpus: '1.0'
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
    environment:
      - NODE_ENV=production
      - REDIS_URL=redis://redis:6379
      - DATABASE_URL=postgresql://user:pass@postgres:5432/ainexlayer
```

#### Vertical Scaling

**Resource Optimization**

```bash
# Optimize container resources
docker run \
  --cpus=4 \
  --memory=8g \
  --memory-swap=8g \
  --ulimit nofile=65536:65536 \
  alakinfotech/ainexlayer:latest
```

**Database Scaling**

```sql
-- PostgreSQL scaling

-- Read replicas via logical replication (requires wal_level = logical on the primary)
CREATE PUBLICATION main_publication FOR ALL TABLES;

-- On replica
CREATE SUBSCRIPTION main_subscription
CONNECTION 'host=master port=5432 user=replicator dbname=ainexlayer'
PUBLICATION main_publication;

-- Connection pooling
-- Use pgbouncer for connection pooling. These settings go in pgbouncer.ini,
-- not in a SQL session:
--
--   [databases]
--   ainexlayer = host=localhost port=5432 dbname=ainexlayer
--
--   [pgbouncer]
--   pool_mode = transaction
--   max_client_conn = 1000
--   default_pool_size = 25
```

### Best Practices

#### Development Best Practices

**Code Optimization**

```javascript
const fs = require('fs');

// Optimize async operations
async function processDocuments(documents) {
  // Use Promise.all for parallel processing
  const results = await Promise.all(
    documents.map(doc => processDocument(doc))
  );
  return results;
}

// Implement proper error handling
async function safeProcessDocument(doc) {
  try {
    return await processDocument(doc);
  } catch (error) {
    console.error(`Failed to process document ${doc.id}:`, error);
    return null;
  }
}

// Use streaming for large datasets
// (transformStream is an application-defined Transform stream)
const stream = fs.createReadStream('large-file.txt');
const processedStream = stream.pipe(transformStream);
```

**Memory Management**

```javascript
// Implement proper memory management
class DocumentProcessor {
  constructor() {
    this.cache = new Map();
    this.maxCacheSize = 1000;
  }

  processDocument(doc) {
    // Return the cached result if this document was already processed
    if (this.cache.has(doc.id)) {
      return this.cache.get(doc.id);
    }

    // Check cache size
    if (this.cache.size >= this.maxCacheSize) {
      // Remove the oldest entry (a Map iterates in insertion order)
      const oldestKey = this.cache.keys().next().value;
      this.cache.delete(oldestKey);
    }

    // Process document (this.process is the application-specific step)
    const result = this.process(doc);
    this.cache.set(doc.id, result);
    return result;
  }
}
```

#### Production Best Practices

**Monitoring and Alerting**

```bash
#!/bin/bash
# alert.sh: set up monitoring alerts (requires bc and curl)

# Check CPU usage
CPU_USAGE=$(top -bn1 | grep "Cpu(s)" | awk '{print $2}' | cut -d'%' -f1)
if (( $(echo "$CPU_USAGE > 80" | bc -l) )); then
    curl -X POST "https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK" \
        -H 'Content-type: application/json' \
        --data "{\"text\":\"High CPU usage: $CPU_USAGE%\"}"
fi

# Check memory usage
MEM_USAGE=$(free | grep Mem | awk '{printf "%.2f", $3/$2 * 100.0}')
if (( $(echo "$MEM_USAGE > 80" | bc -l) )); then
    curl -X POST "https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK" \
        -H 'Content-type: application/json' \
        --data "{\"text\":\"High memory usage: $MEM_USAGE%\"}"
fi
```

**Backup and Recovery**

```bash
#!/bin/bash
# backup.sh: automated backup script
set -euo pipefail

BACKUP_DIR="/backups/$(date +%Y%m%d_%H%M%S)"
mkdir -p "$BACKUP_DIR"

# Backup database
docker exec postgres pg_dump -U ainexlayer ainexlayer > "$BACKUP_DIR/database.sql"

# Backup application data
tar -czf "$BACKUP_DIR/storage.tar.gz" ./storage/

# Backup configuration (.env contains secrets, so restrict access to the backups)
cp .env "$BACKUP_DIR/"
cp docker-compose.yml "$BACKUP_DIR/"

# Upload to cloud storage, keeping each backup under its own prefix
aws s3 sync "$BACKUP_DIR" "s3://your-backup-bucket/$(basename "$BACKUP_DIR")/"

# Clean up local backups older than 7 days (top-level backup directories only)
find /backups -mindepth 1 -maxdepth 1 -type d -mtime +7 -exec rm -rf {} +
```

***

**⚡ Performance optimization is an ongoing process. Monitor your system regularly and adjust configurations based on usage patterns and performance metrics.**
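
As a way to act on the advice to adjust configurations based on performance metrics, the sketch below summarizes request durations as percentiles, which usually expose slow outliers better than an average does. It is a minimal illustration, not part of AINexLayer: the helper names `percentile` and `summarizeLatencies` are hypothetical, and the input is assumed to be an array of durations in seconds, such as the values observed by the `http_request_duration_seconds` histogram.

```javascript
// Hypothetical helpers (not part of AINexLayer): summarize request durations
// so that tuning decisions rest on percentiles rather than on averages.

// Nearest-rank percentile of an ascending-sorted array, for integer p in 1..100.
function percentile(sortedValues, p) {
  if (sortedValues.length === 0) return NaN;
  // Smallest rank such that at least p% of the samples are at or below it
  const rank = Math.ceil((p * sortedValues.length) / 100);
  const index = Math.min(Math.max(rank, 1), sortedValues.length) - 1;
  return sortedValues[index];
}

// Returns count, p50, p95, p99, and max for a list of durations in seconds.
function summarizeLatencies(durations) {
  const sorted = [...durations].sort((a, b) => a - b);
  return {
    count: sorted.length,
    p50: percentile(sorted, 50),
    p95: percentile(sorted, 95),
    p99: percentile(sorted, 99),
    max: sorted.length > 0 ? sorted[sorted.length - 1] : NaN
  };
}

// Example: four fast requests and one slow outlier
const summary = summarizeLatencies([0.2, 0.1, 0.4, 0.3, 2.0]);
console.log(summary);
```

In this example the median stays at 0.3 seconds while p95 and p99 both report the 2.0-second outlier, which is the kind of gap that suggests looking at slow queries or cache misses before changing system-wide settings.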
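
The `concurrency: 4` setting in the Batch Processing configuration implies a cap on how many embedding requests are in flight at once. One possible way to enforce such a cap is sketched below. The helper `mapWithConcurrency` and the stand-in `fakeEmbedding` are hypothetical names introduced for illustration; a real deployment would pass its own embedding function instead.

```javascript
// Hypothetical helper (not part of AINexLayer): run `worker` over `items`
// with at most `limit` calls in flight, returning results in input order.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  // Each lane repeatedly claims the next unprocessed index until none remain.
  // Claiming is race-free because JavaScript runs this code on a single thread.
  async function runLane() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const laneCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: laneCount }, () => runLane()));
  return results;
}

// Stand-in for a real embedding call (assumption: the application supplies its own)
async function fakeEmbedding(text) {
  await new Promise(resolve => setTimeout(resolve, 5));
  return [text.length];
}

// Usage: embed five texts with at most two requests in flight
mapWithConcurrency(['a', 'bb', 'ccc', 'dddd', 'eeeee'], 2, fakeEmbedding)
  .then(vectors => console.log(`embedded ${vectors.length} texts`));
```

Compared with calling `Promise.all` on a whole batch, this keeps the request rate steady, which can help when the embedding provider enforces rate limits.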