Troubleshooting Common Issues
Solutions to the most common problems when deploying AI models to edge devices with Cliff.
Device Connection Issues
Device Not Appearing in Dashboard
Symptoms: Device doesn't show up after registration
Solutions:
- Check agent status:
  sudo systemctl status cliff-agent
- Verify provision key is correct
- Check network connectivity:
  ping api.trycliff.com
- Review agent logs:
  sudo journalctl -u cliff-agent -n 50
- Ensure firewall allows outbound HTTPS (port 443)
Device Disconnects Frequently
Symptoms: Device shows as disconnected intermittently
Solutions:
- Check network stability
- Verify device has sufficient resources (CPU, memory, disk)
- Review system logs for errors
- Check agent configuration
- Ensure device time is synchronized (NTP)
Deployment Issues
Deployment Fails
Symptoms: Deployment shows as failed or stuck
Solutions:
- Check device logs in dashboard
- Verify model format is supported
- Ensure device has sufficient storage
- Check resource limits (CPU, memory)
- Review model requirements vs device capabilities
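The storage check above can be run before a deployment is started. The following is a minimal sketch, assuming the model is a single local file; the function name `has_room_for_model` and the 20% safety margin are illustrative assumptions, not part of Cliff.

```python
import os
import shutil


def has_room_for_model(model_path: str, target_dir: str, margin: float = 0.2) -> bool:
    """Return True if target_dir has space for the model plus a safety margin.

    The default 20% margin is an assumption for illustration; it leaves
    headroom for temporary files created while the model is unpacked.
    """
    required = int(os.path.getsize(model_path) * (1 + margin))
    free = shutil.disk_usage(target_dir).free
    return free >= required
```

Run it on the device against the directory where models are stored; a False result points to storage as the likely cause before you look at CPU or memory limits.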
Model Not Loading
Symptoms: Model uploads but won't deploy
Solutions:
- Verify model format (ONNX, TensorFlow, etc.)
- Check model file integrity
- Ensure model is compatible with device architecture
- Review model size vs available memory
- Check model input/output specifications
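File integrity can be checked by comparing a checksum taken at export time with one computed on the uploaded copy. This is a sketch using SHA-256 from the Python standard library; it assumes you recorded the expected checksum yourself when the model was exported, and the function names are hypothetical.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash the file in 1 MiB chunks so large models are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def model_is_intact(path: str, expected_sha256: str) -> bool:
    """Compare the file's hash with a checksum recorded at export time."""
    return sha256_of_file(path) == expected_sha256.strip().lower()
```

A mismatch means the file changed between export and upload (truncated transfer, wrong file), which is worth ruling out before investigating format or architecture compatibility.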
Performance Issues
Slow Inference
Symptoms: High latency, slow response times
Solutions:
- Optimize model (quantization, pruning)
- Reduce input image size
- Enable GPU if available
- Check CPU/memory usage
- Review preprocessing steps
- Consider a smaller model architecture or variant
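Before and after applying any of these changes, it helps to measure latency the same way each time. The sketch below times an arbitrary zero-argument callable; `run_once` stands in for whatever call runs one inference in your application and is an assumption, as are the warmup and run counts.

```python
import statistics
import time
from typing import Callable, Dict


def measure_latency(run_once: Callable[[], object], warmup: int = 3, runs: int = 30) -> Dict[str, float]:
    """Time a zero-argument inference callable and return latency stats in ms.

    runs must be at least 1.
    """
    for _ in range(warmup):  # warm caches and lazy initialization before timing
        run_once()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    # Nearest-rank style index for the 95th percentile of the sorted samples.
    p95_index = min(len(samples) - 1, int(round(0.95 * (len(samples) - 1))))
    return {
        "median_ms": statistics.median(samples),
        "p95_ms": samples[p95_index],
        "max_ms": samples[-1],
    }
```

Comparing the median and the 95th percentile separately is useful: a high median suggests the model itself is too heavy, while a good median with a poor tail often points to resource contention on the device.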
High Memory Usage
Symptoms: Out of memory errors, device crashes
Solutions:
- Reduce batch size
- Use model quantization
- Optimize preprocessing
- Increase device memory if possible
- Use memory-efficient model variants
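The effect of quantization on memory can be estimated with simple arithmetic: weight memory is roughly the parameter count times the bytes per parameter. The sketch below assumes weights dominate memory use and applies a 1.5x overhead factor for activations and runtime buffers; that factor and the function names are illustrative assumptions, not measured values.

```python
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def estimate_weights_mib(num_params: int, dtype: str = "float32") -> float:
    """Approximate memory for the model weights alone, in MiB."""
    return num_params * BYTES_PER_PARAM[dtype] / (1024 ** 2)


def fits_in_memory(num_params: int, dtype: str, available_mib: float, overhead: float = 1.5) -> bool:
    """Rough fit check; the overhead multiplier is an assumed rule of thumb."""
    return estimate_weights_mib(num_params, dtype) * overhead <= available_mib
```

For example, a model with 25 million parameters needs about 95 MiB of weights in float32 but about 24 MiB in int8, which is why quantization is usually the first thing to try on a memory-constrained device.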
Model Issues
Incorrect Predictions
Symptoms: Model produces wrong results
Solutions:
- Verify input preprocessing matches training
- Check input data format and range
- Ensure model version is correct
- Validate model was exported correctly
- Test with known inputs
- Check for data drift
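Two of the checks above, input range and testing with known inputs, are easy to automate. This is a minimal sketch that assumes the model can be wrapped in a `predict` callable returning a flat list of floats; that wrapper, the tolerances, and the function names are all hypothetical.

```python
import math
from typing import Callable, Sequence


def input_in_range(values: Sequence[float], low: float, high: float) -> bool:
    """Check that every input value lies in the range the model was trained on."""
    return all(low <= v <= high for v in values)


def matches_reference(
    predict: Callable[[Sequence[float]], Sequence[float]],
    sample: Sequence[float],
    expected: Sequence[float],
    rel_tol: float = 1e-3,
    abs_tol: float = 1e-5,
) -> bool:
    """Run one known input and compare the output with a stored reference."""
    actual = predict(sample)
    if len(actual) != len(expected):
        return False
    return all(
        math.isclose(a, e, rel_tol=rel_tol, abs_tol=abs_tol)
        for a, e in zip(actual, expected)
    )
```

A common failure this catches is feeding raw 0-255 pixel values to a model trained on inputs normalized to 0-1: `input_in_range` fails immediately, before any prediction is compared.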
Model Version Mismatch
Symptoms: Unexpected behavior after update
Solutions:
- Verify correct model version is deployed
- Check model metadata
- Review deployment logs
- Roll back to the previous version if needed
- Compare model configurations
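Comparing configurations is quicker with a small diff than by eye. The sketch below assumes each model version's configuration is available as a flat dictionary with string keys; the field names in the usage example are hypothetical and not taken from Cliff's metadata format.

```python
from typing import Any, Dict, Tuple

MISSING = "<missing>"  # placeholder for a key present on only one side


def diff_configs(old: Dict[str, Any], new: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return {key: (old_value, new_value)} for every key that differs."""
    changes = {}
    for key in sorted(set(old) | set(new)):
        before = old.get(key, MISSING)
        after = new.get(key, MISSING)
        if before != after:
            changes[key] = (before, after)
    return changes


# Usage with made-up fields:
previous = {"version": "1.2.0", "input_size": 224, "quantized": False}
current = {"version": "1.3.0", "input_size": 224, "quantized": True}
print(diff_configs(previous, current))
```

An empty result means the configurations are identical, so the unexpected behavior is more likely to come from the model weights or the input data than from settings.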
Network Issues
API Timeouts
Symptoms: Requests time out, connection errors
Solutions:
- Check network connectivity
- Verify API endpoint is accessible
- Review firewall rules
- Check DNS resolution
- Test with curl/wget
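The checks above can be run in order so that the first failing stage is reported. This sketch uses only the Python standard library and distinguishes a DNS failure from a TCP connection failure; it does not perform a TLS handshake or send an HTTP request, so a result of "ok" only means the port is reachable. The function name and return values are illustrative.

```python
import socket


def diagnose_endpoint(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Return 'dns', 'tcp', or 'ok' depending on the first stage that fails."""
    try:
        addresses = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns"  # name could not be resolved
    for family, socktype, proto, _canonname, sockaddr in addresses:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            return "ok"
        except OSError:
            continue  # refused or timed out; try the next resolved address
    return "tcp"


if __name__ == "__main__":
    print(diagnose_endpoint("api.trycliff.com"))
```

A "dns" result points to resolver configuration, while "tcp" usually points to firewall rules or routing, which narrows down which of the bullets above to investigate first.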
Slow Data Transfer
Symptoms: Slow model uploads/downloads
Solutions:
- Check network bandwidth
- Use compression for large files
- Consider CDN for model distribution
- Verify device network speed
- Check for network congestion
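To judge whether compression is worth it, compress the file once and compare the ideal transfer times. A minimal sketch using gzip from the standard library; it assumes a non-empty source file, and the transfer estimate ignores protocol overhead and congestion, so treat it as a lower bound.

```python
import gzip
import os
import shutil


def gzip_file(src: str, dst: str) -> float:
    """Compress src to dst and return the compressed/original size ratio."""
    with open(src, "rb") as f_in, gzip.open(dst, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)
    return os.path.getsize(dst) / os.path.getsize(src)


def transfer_seconds(size_bytes: int, bandwidth_mbps: float) -> float:
    """Ideal transfer time: bits to send divided by the link rate in megabits per second."""
    return size_bytes * 8 / (bandwidth_mbps * 1_000_000)
```

For example, a 50 MB model over a 10 Mbps link takes at least 40 seconds. Trained weights often compress poorly, so check the ratio returned by `gzip_file` before adding a compression step to your pipeline.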
Getting Help
If you're still experiencing issues:
- Check Documentation: Review our guides and tutorials
- Search Knowledge Base: Look for similar issues
- Review Logs: Check device and application logs
- Contact Support: Reach out with:
  - Error messages
  - Log excerpts
  - Steps to reproduce
  - Device specifications
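Device specifications can be gathered with a short script run on the device. This sketch collects only what the Python standard library exposes; the choice of fields is an assumption about what is useful in a support request, not a list required by Cliff.

```python
import os
import platform
import shutil


def collect_device_specs(path: str = "/") -> dict:
    """Gather basic OS, CPU, and disk details for a support request."""
    usage = shutil.disk_usage(path)
    return {
        "os": platform.system(),
        "os_release": platform.release(),
        "architecture": platform.machine(),
        "python": platform.python_version(),
        "cpu_count": os.cpu_count(),
        "disk_total_gb": round(usage.total / 1e9, 1),
        "disk_free_gb": round(usage.free / 1e9, 1),
    }


if __name__ == "__main__":
    for key, value in collect_device_specs().items():
        print(f"{key}: {value}")
```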
Prevention
To avoid common issues:
- Test models locally before deployment
- Monitor device health regularly
- Keep agent software updated
- Use health checks and monitoring
- Follow best practices guides
- Document your configurations
